DIAGNOSIS OF ACUTE AND
CHRONIC ENTERIC FEVER
USING METABOLOMICS
Elin Näsström
Department of Chemistry
Umeå 2017
This work is protected by the Swedish Copyright Legislation (Act 1960:729)
Dissertation for PhD
ISBN: 978-91-7601-773-9
Cover: Marcus Östman. Watercolour, 31x23 cm on Saunders Waterford, 300 g.
Electronic version available at: http://umu.diva-portal.org/
Printed by: Lars Åberg, VMC-KBC Umeå
Umeå, Sweden 2017
“I am among those who think that science has great beauty”
Marie Curie
Table of Contents
Abstract
Popular science summary (Populärvetenskaplig sammanfattning)
List of Publications
Abbreviations and notations
1. Aims of this work
2. Background
   2.1. General description of the problem
   2.2. Enteric fever
   2.3. Diagnosis
   2.4. Metabolomics
3. Methods
   3.1. Sample matrices
   3.2. Sample preparation for metabolomics analysis
   3.3. Metabolite analysis
   3.4. Processing of metabolomics data
   3.5. Multivariate data analysis
   3.6. Validation and interpretation of metabolomics data
   3.7. Comparison of metabolites between studies
4. Results
   4.1. Paper I
   4.2. Paper II
   4.3. Paper III
   4.4. Metabolites and metabolite patterns
5. Conclusions and Future perspectives
6. Acknowledgements
7. References
Abstract
Enteric (or typhoid) fever is a systemic infection mainly caused by Salmonella
Typhi and Salmonella Paratyphi A. The disease is common in areas with poor
water quality and insufficient sanitation. Humans are the only reservoir for
transmission of the disease. The presence of asymptomatic chronic carriers is a
complicating factor for the transmission. There are major limitations regarding
the current diagnostic methods both for acute infection and chronic carriage.
Metabolomics is a methodology studying metabolites in biological systems
under influence of environmental or physiological perturbations. It has been
applied to study several infectious diseases, with the goal of detecting diagnostic
biomarkers. In this thesis, a mass spectrometry-based metabolomics approach,
including chemometric bioinformatics techniques for data analysis, has been
used to evaluate the potential of metabolite biomarker patterns for diagnosis of
enteric fever at different stages of the disease.
In Paper I, metabolite patterns related to acute enteric fever were investigated.
Human plasma samples from patients in Nepal with culture-confirmed S. Typhi
or S. Paratyphi A infection were compared to afebrile controls. A metabolite
pattern discriminating between acute enteric fever and afebrile controls, as well
as between the two causative agents of enteric fever, was detected. The strength
of using a panel of metabolites instead of single metabolites as biomarkers was
also highlighted. In Paper II, metabolite patterns for acute enteric fever, this
time focusing only on S. Typhi infections, were investigated. Human plasma
samples from patients in Bangladesh with culture-positive, or culture-negative
but clinically suspected, S. Typhi infection were compared to febrile controls.
Differences in metabolite patterns were found between the culture-positive
S. Typhi group and the febrile controls, while the suspected S. Typhi samples
were heterogeneous. The metabolite patterns were consistent with the results
from Paper I. In addition, a validation cohort with culture-positive S. Typhi
samples and a control group including patients with malaria and infections
caused by other pathogens was analysed. Differences in metabolite patterns were
detected between S. Typhi samples and all controls, as well as between S. Typhi
and malaria. The metabolite patterns were consistent with those of the primary
Bangladeshi cohort and the Nepali cohort from Paper I. Paper III focused on
chronic Salmonella carriers. Human plasma samples from patients in Nepal
undergoing cholecystectomy with confirmed S. Typhi or S. Paratyphi A
gallbladder carriage were compared to non-carriage controls. The Salmonella
carriage samples were distinguished from the non-carriage controls and
differential signatures were also found between the S. Typhi and S. Paratyphi A
carriage samples. Comparing metabolites found during chronic carriage and
acute enteric fever (in Paper I) resulted in a panel of metabolites significant
only during chronic carriage. This work highlights the
potential of metabolomics as a tool for finding diagnostic biomarker patterns
associated with different stages of enteric fever.
Popular science summary (Populärvetenskaplig sammanfattning)
Enteric fever (or typhoid fever) is a common infectious disease in countries
with poor water quality and inadequate sanitation. The disease is mainly caused
by the bacteria Salmonella Typhi and Salmonella Paratyphi A. Unlike many other
infectious diseases that are spread via animals, such as malaria and avian
influenza, enteric fever can only be transmitted between humans, which in
itself creates opportunities for eliminating the disease. One factor that
complicates the transmission of the disease, however, is the existence of
chronic carriers, i.e. people who can spread the bacteria without having any
symptoms of the disease themselves. The symptoms of enteric fever are often
general and resemble those of many other febrile diseases. Common symptoms are
fever, malaise, headache, cough, muscle pain, abdominal pain and constipation
or diarrhoea. At present, acute enteric fever is most often diagnosed by blood
culture, i.e. isolation of the bacterium from a blood sample. A major drawback
of blood culture is that the method does not always find the bacterium in
people who have the disease, due to low and variable sensitivity. It also takes
a long time to obtain results, and advanced microbiological equipment is
required to perform the culture, something that is not always available in the
areas where the disease is most common. There are also other diagnostic methods
for enteric fever, but all of them have their limitations. New, rapid,
sensitive and specific diagnostic methods that can be used in resource-limited
areas are therefore needed. Better diagnostic methods can lead to faster and
more specific treatment of patients, which can reduce the risk of complications
and continued spread of the disease. In a wider perspective, improved
diagnostic methods can give a more reliable estimate of the total disease
burden in specific regions. Improved diagnostic methods are needed not only for
acute enteric fever but also for detecting chronic carriers, in order to reduce
the spread of the disease.
In this thesis, metabolomics in combination with advanced data analysis methods
has been used to investigate the possibility of using a panel of metabolites to
diagnose enteric fever at different stages of the disease. The overall
hypothesis of this thesis is that a combination of correlated (interacting)
biomarkers, a so-called latent biomarker, has stronger diagnostic potential
than single biomarkers. Metabolomics is the study of metabolites (small
chemical molecules), such as amino acids, sugars and fatty acids, in biological
samples (e.g. blood, urine or tissue). Specifically, changes in metabolite
levels in a biological system under some form of perturbation, e.g. a disease,
are often studied. A disease can be expected to give rise to a specific
metabolite pattern, which can then be compared with those of other diseases. To
measure these metabolites, a combination of chromatography and mass
spectrometry has been used to separate molecules (based on their chemical
properties) and then detect and identify them (based on their mass). These
measurements generate large amounts of data. Typically, hundreds of metabolites
are measured, and to interpret these large and often complex data, chemometric
methods, or more specifically multivariate data analysis, are used.
Multivariate data analysis takes all metabolites in all samples into account
simultaneously and creates a statistical model in which the relationships
between the analysed samples can be visualized and interpreted in detail. Based
on these models, latent biomarkers can then be found.
In the first study, metabolite patterns related to acute enteric fever were
analysed. Plasma samples from patients in Nepal with infection caused by
S. Typhi or S. Paratyphi A were compared to afebrile controls. A metabolite
pattern was found that could distinguish samples from acute enteric fever from
controls, as well as infection caused by S. Typhi from infection caused by
S. Paratyphi A. In addition, the strength of using a combination of metabolites
instead of a single metabolite as a biomarker was demonstrated.
In the second study, a similar approach was used to investigate acute enteric
fever, this time focusing on S. Typhi infection. Plasma samples from patients
in Bangladesh with a positive blood culture for S. Typhi, or a negative blood
culture but clinically suspected S. Typhi infection, were compared to febrile
controls. Metabolites were found that distinguished culture-positive S. Typhi
from febrile controls. In the group with suspected typhoid fever, some patients
had metabolite patterns resembling the confirmed typhoid cases while others had
patterns resembling the controls. The metabolite pattern was compared with the
results of the first study, and significant metabolites with the same direction
(increase/decrease in the typhoid group compared to controls) were found. A
validation study containing samples from patients with confirmed S. Typhi
infection, as well as controls with malaria and with infections caused by other
pathogens, was also analysed.
The third study focused on chronic carriers of Salmonella bacteria. Plasma
samples from patients in Nepal who, during gallbladder surgery, were confirmed
as carriers of S. Typhi or S. Paratyphi A in the gallbladder were compared to
non-carriers. Salmonella carriers could be distinguished from non-carriers, and
there were also signs of separation between carriers of S. Typhi and
S. Paratyphi A. A comparison of metabolites related to chronic carriage and
acute enteric fever (from the first study) identified metabolites that were
significant only for chronic carriers. The work in this thesis has demonstrated
the potential of using metabolomics as a tool in the search for diagnostic
markers for enteric fever at different stages of the disease. Taken together,
these results can be seen as a step towards improved diagnostic methods for
enteric fever.
List of Publications
This thesis is based on the following papers, referred to in the text by the Roman
numerals in bold font.
I. Näsström E, Nga TVT, Dongol S, Karkey A, Vinh PV, Thanh TH,
Johansson A, Aryal A, Dolecek C, Basnyat B, Baker S, Antti H.
Salmonella Typhi and Salmonella Paratyphi A elaborate distinct
systemic metabolite signatures during enteric fever. eLife 3, e03100
(2014).
II. Näsström E, Parry CM, Nga TVT, Maude RR, de Jong HK, Fukushima
M, Rzhepishevska O, Marks F, Panzner U, Im J, Jeon H, Park S,
Chaudhury Z, Ghose A, Samad R, Van TT, Johansson A, Dondorp AM,
Thwaites GE, Faiz A, Antti H, Baker S. Reproducible diagnostic
metabolites in plasma from typhoid fever patients in Asia and Africa.
eLife 6, e15651 (2017).
III. Näsström E, Jonsson P, Johansson A, Dongol S, Karkey A, Basnyat B,
Nga TVT, Tan TV, Thwaites GE, Antti H, Baker S. Metabolite biomarkers
of typhoid chronic carriage.
Submitted manuscript.
Papers I and II, published in eLife, are distributed under the terms of the
Creative Commons Attribution License.
Author's contributions
Summary of the contribution of Elin Näsström to the papers included in this
thesis:
I. Performed sample preparation for metabolomics analysis, analysis of
samples with GCxGC-TOFMS, data processing (including investigation
of methods to process GCxGC-TOFMS data), metabolite identification
and multivariate data analysis. Drafted parts of the initial manuscript
(including figures and tables) and revised the manuscript for
resubmission and publication.
II. Performed sample preparation for metabolomics analysis, analysis of
samples with GCxGC-TOFMS and UHPLC-Q-TOFMS, planning of
analysis of samples with GC-TOFMS, data processing (including
investigation of methods to process UHPLC-Q-TOFMS data),
metabolite identification and multivariate data analysis. Drafted parts
of the initial manuscript (including figures and tables) and revised the
manuscript for resubmission and publication.
III. Performed sample preparation for metabolomics analysis, analysis of
samples with GCxGC-TOFMS, data processing, metabolite
identification and multivariate data analysis. Drafted the manuscript
(including figures and tables) and revised the manuscript for
resubmission.
Abbreviations and notations
Abbreviations
ANOVA Analysis of variance
AUC Area under the curve
CV Cross-validation
DA Discriminant analysis
EI Electron ionization
ESI Electrospray ionization
GC Gas chromatography
GC-TOFMS Gas chromatography time-of-flight mass spectrometry
GCxGC Comprehensive two-dimensional gas chromatography
GCxGC-TOFMS Comprehensive two-dimensional gas chromatography time-of-flight mass spectrometry
HMCR Hierarchical multivariate curve resolution
IS Internal standard
LC Liquid chromatography
LPS Lipopolysaccharide
m/z Mass-to-charge ratio
MDR Multidrug-resistant
MIC Minimum inhibitory concentration
MPP Mass Profiler Professional
MS Mass spectrometry
MSTFA N-methyl-N-trimethylsilyl-trifluoroacetamide
NIPALS Non-linear iterative partial least squares
NMR Nuclear magnetic resonance
OPLS Orthogonal partial least squares
PAMP Pathogen-associated molecular pattern
PCA Principal component analysis
PCR Polymerase chain reaction
PLS Partial least squares
QC Quality control
RDT Rapid diagnostic test
RI Retention index
ROC Receiver operating characteristic
RT1 First-dimension retention time
RT2 Second-dimension retention time
SMC Swedish Metabolomics Centre
SPI Salmonella pathogenicity island
TLR Toll-like receptor
TMCS Trimethylchlorosilane
TOF Time-of-flight
UHPLC-Q-TOFMS Ultra-high performance liquid chromatography quadrupole-time-of-flight mass spectrometry
WHO World Health Organization
Notations
The following notations have been used in this thesis. Matrices are denoted by
bold capital letters (e.g. X) and vectors are denoted by bold lower case letters
(e.g. t). Vectors are column vectors unless stated otherwise. Transposition is
denoted by a superscript T.
Matrix dimensions
A Number of components in model
K Number of columns in X
M Number of columns in Y
N Number of rows in X and Y
Matrices
B Matrix of regression coefficients for X, [KxM]
E Residual matrix of predictor variables, [NxK]
F Residual matrix of response variables, [NxM]
P Matrix of loading vectors for X, [KxA]
T Matrix of score vectors for X, [NxA]
U Matrix of score vectors for Y, [NxA]
X Matrix of descriptor variables, [NxK]
Y Matrix of response variables, [NxM]
Vectors
c Weight vector for Y, [Mx1]
p Loading vector for X, [Kx1]
t Score vector for X, [Nx1]
u Score vector for Y, [Nx1]
w Weight or covariance loading vector for X, [Kx1]
Other notations
o Orthogonal
p Predictive
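As a reading aid, the symbols listed above are usually tied together by the bilinear decompositions sketched below. This is the standard textbook form of a PLS-type model, given here as an assumption and not quoted from the thesis. The matrix C [MxA], which collects the Y weight vectors c as columns, is introduced only for this illustration, as are the subscripted matrices in the last line, which follow the "o" (orthogonal) and "p" (predictive) notation.

```latex
\begin{align}
\mathbf{X} &= \mathbf{T}\mathbf{P}^{T} + \mathbf{E}
  && [N \times K] = [N \times A][A \times K] + [N \times K] \\
\mathbf{Y} &= \mathbf{T}\mathbf{C}^{T} + \mathbf{F}
  && [N \times M] = [N \times A][A \times M] + [N \times M] \\
\hat{\mathbf{Y}} &= \mathbf{X}\mathbf{B}
  && [N \times M] = [N \times K][K \times M] \\
\mathbf{X} &= \mathbf{T}_{p}\mathbf{P}_{p}^{T} + \mathbf{T}_{o}\mathbf{P}_{o}^{T} + \mathbf{E}
  && \text{(OPLS: predictive and orthogonal parts)}
\end{align}
```

The dimension annotations agree with the matrix sizes listed above, which offers a quick consistency check on the notation.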
1. Aims of this work
The overall hypothesis of this work has been that a combination of correlated
biomarkers in a marker pattern, a so-called latent biomarker, has a stronger
diagnostic potential as compared to single biomarkers. Here this hypothesis was
applied to diagnosis of the infectious disease enteric fever in human plasma,
with metabolites as the individual building blocks of the potential latent
biomarkers.
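The idea behind this hypothesis can be illustrated with a toy calculation. The sketch below uses entirely synthetic marker levels (invented for illustration, not data from the thesis) and a plain sum of two markers as a stand-in for the multivariate models used in this work. The helper name auc is hypothetical; it computes the area under the ROC curve as the fraction of case/control pairs in which the case scores higher, with ties counted as one half.

```python
def auc(cases, controls):
    """Area under the ROC curve as the probability that a random case
    scores higher than a random control (ties count as 0.5)."""
    wins = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                wins += 1.0
            elif case_value == control_value:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Synthetic (marker_1, marker_2) levels. NOT data from the thesis.
cases = [(3, 1), (1, 3), (2, 2)]
controls = [(2, 0), (0, 2), (1, 1)]

# Each marker evaluated on its own.
auc_marker_1 = auc([m1 for m1, _ in cases], [m1 for m1, _ in controls])
auc_marker_2 = auc([m2 for _, m2 in cases], [m2 for _, m2 in controls])

# A very simple "panel": the sum of both markers (an assumption made for
# illustration; the thesis uses multivariate models, not a plain sum).
auc_panel = auc([m1 + m2 for m1, m2 in cases],
                [m1 + m2 for m1, m2 in controls])

print(f"marker 1: {auc_marker_1:.2f}")
print(f"marker 2: {auc_marker_2:.2f}")
print(f"panel:    {auc_panel:.2f}")
```

In this constructed example each single marker reaches an AUC of about 0.78, while the two-marker panel separates the groups completely (AUC 1.0). Real data will rarely behave this cleanly.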
The overall aims of this work were to:
• Evaluate the potential of plasma metabolite marker patterns for
  diagnosis of enteric fever at different disease stages using mass
  spectrometry-based metabolomics combined with chemometric
  bioinformatics.
• Suggest diagnostic biomarker patterns suitable for the development of
  point-of-care diagnostic tests.
• Provide clues for further investigations into the disease mechanisms of
  enteric fever.
To achieve this, the specific aims have been to:
• Investigate metabolite patterns
  o for acute stage enteric fever compared to afebrile and febrile
    controls (Papers I and II)
  o for culture-positive and culture-negative S. Typhi infection
    (Paper II)
  o for S. Typhi and S. Paratyphi A enteric fever (Papers I and III)
  o for chronic stage enteric fever compared to non-carrier controls
    (Paper III)
• Compare detected metabolite patterns
  o during acute stage enteric fever across cohorts (Paper II)
  o between acute and chronic stage enteric fever (Paper III)
2. Background
2.1. General description of the problem
Enteric fever is an infectious disease receiving relatively little attention in global
media. The disease is no longer a common disease in high-income countries but
constitutes a major problem in many low- and middle-income countries. In
2010, enteric fever was ranked as the 13th most common cause of years of life
lost in Southeast Asia (21st in sub-Saharan Africa) compared to around 100th
place in Western Europe and North America1. In Western Europe and North
America, diabetes and breast cancer have the same rank as enteric fever in
Southeast Asia1.
There is no animal reservoir for enteric fever and therefore transmission occurs
only among humans. The transmission is complicated by the presence of
chronic asymptomatic carriers, who form a reservoir for the infecting
bacteria. Enteric fever arises mainly in areas with poor water quality and
sanitation. These two factors along with the availability of vaccines suggest that
there should be possibilities for regional elimination of the disease. However,
large efforts are needed in many areas to reach such a goal2. One area of great
importance is the improvement of diagnostic methods since current methods
suffer from major limitations3. Inaccurate diagnostic methods have
consequences for infected individuals, who face an increased risk of
complications due to inappropriate treatment. Accurate diagnosis is also
important for estimating the global disease burden, which can aid
decision-making regarding vaccination programs and investments in water and
sanitation infrastructure. In addition, the
ability to detect chronic carriers would be relevant as a public health tool.
In this thesis, we have applied a mass spectrometry-based metabolomics
approach, including chemometric bioinformatics techniques for data analysis,
to investigate the metabolite patterns in plasma and urine from patients with
acute and chronic stages of enteric fever. We aimed to search for metabolite
patterns with the potential for use in the development of new diagnostic tests.
These approaches may not offer a simple and applicable solution to the
problem. Instead, this work should be considered as a contribution to the field
of improved diagnostic methods for enteric fever. This section aims to provide a
background to the work presented in the thesis including a description of
several aspects of enteric fever, diagnostic tests especially in resource-limited
areas, and finally an introduction to the field of metabolomics with specific
focus on infectious diseases.
2.2. Enteric fever
2.2.1. Causative bacteria
Enteric fever is a systemic bacterial infection caused by the Gram-negative
Salmonella enterica serovar Typhi (S. Typhi) and the Paratyphi serovars A, B
and C (S. Paratyphi A, B and C) of which S. Paratyphi A is most common4.
Enteric fever is a generic term for infections caused by both S. Typhi and S.
Paratyphi. Typhoid and paratyphoid fever refer to the infections caused by the
individual serovars. Throughout this work the term enteric fever is mostly used,
but typhoid fever is also used where the focus is on S. Typhi infections.
S. Typhi has historically been the most common cause of enteric fever but
recently there have been several reports on the emergence of enteric fever
caused by S. Paratyphi A especially in Asia5–7. However, there is an uncertainty
in the true disease burden caused by S. Paratyphi A in many regions in Asia and
Africa8. S. Paratyphi A has been considered to cause a milder disease than S.
Typhi4. This was questioned in a Nepali study where both serovars were shown to
cause similar clinical syndromes9. The belief that S. Paratyphi A causes a
milder disease has resulted in a limited focus on S. Paratyphi A, especially
with regard to vaccine development. There are however known differences between
S. Typhi and S. Paratyphi A, including different epidemiology and routes of
transmission. S. Typhi is associated with poor water quality and within-
household risks while S. Paratyphi A is associated with street food consumption
and risks outside the household10,11. The main microbiological difference
between S. Typhi and S. Paratyphi A is that S. Paratyphi A lacks the Vi-
polysaccharide capsule that is characteristic for S. Typhi12 and forms the basis of
one of the licensed typhoid vaccines. There are no licensed vaccines against S.
Paratyphi A4. There are also indications in some geographical areas that S.
Paratyphi A is more prone to develop antimicrobial resistance than S. Typhi9.
2.2.2. Disease burden
Enteric fever occurs most commonly in countries with insufficient clean
water supply and poor hygiene and sanitation standards13. It is a common
disease in many countries in South and Southeast Asia, while the disease
burden in Africa has not been well described but is considered to be high14–16. In
high-income countries the occurrence of enteric fever decreased dramatically in
the early 1900s as a result of improved water and sanitary conditions
and is now mainly associated with travellers returning from endemic areas17.
The disease is most common among children and young adults4. A number of
estimates of the global burden of enteric fever have been made, but these
estimates are often variable and uncertain due to limitations in current
diagnostic methods2,13. Estimates suggest that there were around 21 million
cases caused by S. Typhi and 5 million cases caused by S. Paratyphi A during
2000¹⁸ and roughly 27 million cases caused by S. Typhi during 2010¹⁹. Figure 1
shows the global distribution of enteric fever. The average fatality rate for
treated cases is estimated at less than 1%, but there can be large regional
differences (with fatality rates up to 50% in parts of Indonesia and Papua New
Guinea)13. Even with this fairly low average fatality rate, enteric fever
remains a major problem in many parts of the world considering the high
number of estimated cases each year.
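To make the scale concrete, the case estimates and the average fatality rate quoted above can be combined into a rough ceiling on the number of deaths per year. The Python sketch below is back-of-the-envelope arithmetic only. The function name max_deaths is hypothetical, and the 1% figure is used as an upper bound because the text states only "less than 1%". The estimate applies to treated cases, so the result is not a figure from the cited studies.

```python
def max_deaths(cases: int, fatality_percent_upper: float) -> float:
    """Upper bound on deaths for a case count and an upper bound on the
    fatality rate, given in percent."""
    if cases < 0:
        raise ValueError("cases must be non-negative")
    if not 0.0 <= fatality_percent_upper <= 100.0:
        raise ValueError("fatality_percent_upper must be within 0-100")
    return cases * fatality_percent_upper / 100


# Case estimates quoted in the text above.
S_TYPHI_CASES_2000 = 21_000_000
S_PARATYPHI_A_CASES_2000 = 5_000_000
S_TYPHI_CASES_2010 = 27_000_000

# Assumption: 1% is treated as a ceiling, since the text says "less than 1%".
FATALITY_PERCENT_UPPER = 1.0

for label, cases in [
    ("S. Typhi, 2000", S_TYPHI_CASES_2000),
    ("S. Paratyphi A, 2000", S_PARATYPHI_A_CASES_2000),
    ("S. Typhi, 2010", S_TYPHI_CASES_2010),
]:
    bound = max_deaths(cases, FATALITY_PERCENT_UPPER)
    print(f"{label}: fewer than {bound:,.0f} deaths")
```

Even a fatality rate below 1% therefore corresponds to a ceiling in the hundreds of thousands of deaths per year for S. Typhi alone, which is why the low average rate does not make the disease a minor problem.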
Figure 1. Map showing the global distribution of enteric fever with endemic and highly endemic countries highlighted. Illustration based on data from WHO.
2.2.3. Transmission and the carrier state
S. Typhi and S. Paratyphi A are human-restricted pathogens, thus there is no
animal reservoir for enteric fever12. The disease is transmitted through the
faecal-oral route, mainly via water and food contaminated by faeces containing
the Salmonella bacteria4. The type of source that is contaminated and the
amount of bacteria will have a large impact on the number of people that will
become infected as well as on the duration of the incubation period12.
Enteric fever has a documented carrier state where the infection is not entirely
cleared and bacteria can be shed through the faeces. The carrier state can be
divided into different categories depending on the duration of the carriage post
infection: convalescent (three weeks to three months), temporary (three months
to one year) and chronic (over one year)13,20. The chronic carriage state is present
in around 2-5% of the acute enteric fever cases and carriers are generally
asymptomatic21,22. In addition, up to 25% of the chronic carriers have no
documented history of typhoid fever13. One famous chronic carrier was Typhoid
Mary, who worked as a cook in New York in the early 1900s and spread typhoid
fever without having symptoms of the disease23. Like other pathogens with an
ability to multiply within infected host cells, S. Typhi can subvert the immune
system to survive undetected or without causing many symptoms of infection. Such
“stealth” features of S. Typhi (and S. Paratyphi A)24 allow persistent
colonization of the bone marrow and the gallbladder21,25. One of the strategies
used by the bacteria to colonize the gallbladder is to resist the antimicrobial
properties of the bile26. Another mechanism associated with gallbladder
colonization is the formation of biofilms on gallstones27. General risk factors for
chronic carriage are age and sex, with higher risk among females over the age of
40²¹. There are also correlations with gallbladder disease, such as cholecystitis28.
In addition, an increased risk of gallbladder cancer has been associated with
chronic Salmonella carriage (an observed/expected ratio of 167 for gallbladder
cancer has been reported for chronic carriers compared to acute typhoid fever
cases)29. The exact contribution of chronic carriers to the transmission of enteric
fever remains unclear. Studies have reported that chronic carriers might have a
more important role in the transmission of enteric fever in high-income
countries (often caused by single outbreaks) compared to endemic areas
(transmission from multiple sources)12,30,31. Alternatively, convalescent carriers
could be increasingly responsible for disease transmission in endemic areas10,32.
However, it is apparent that chronic carriers harbour and shed the bacteria.
Thus, it would be of great importance to be able to detect them, especially if the
long-term goal is to reach regional elimination20.
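The duration-based carrier categories described in this section can be written as a small lookup. This is a minimal Python sketch under stated assumptions: the day counts used for "three weeks", "three months" and "one year" are approximations chosen here, as is the handling of the exact boundaries, since the text does not specify either. The names CarrierState and classify_carrier are hypothetical.

```python
from enum import Enum


class CarrierState(Enum):
    NOT_CLASSIFIED = "not classified as a carrier"
    CONVALESCENT = "convalescent"
    TEMPORARY = "temporary"
    CHRONIC = "chronic"


# Assumed day-count equivalents of the durations given in the text.
THREE_WEEKS = 21
THREE_MONTHS = 90
ONE_YEAR = 365


def classify_carrier(days_shedding_post_infection: int) -> CarrierState:
    """Map the duration of bacterial shedding after infection (in days)
    to a carrier category. Boundary handling is an assumption."""
    if days_shedding_post_infection < 0:
        raise ValueError("duration must be non-negative")
    if days_shedding_post_infection > ONE_YEAR:
        return CarrierState.CHRONIC          # over one year
    if days_shedding_post_infection > THREE_MONTHS:
        return CarrierState.TEMPORARY        # three months to one year
    if days_shedding_post_infection >= THREE_WEEKS:
        return CarrierState.CONVALESCENT     # three weeks to three months
    return CarrierState.NOT_CLASSIFIED       # shorter than three weeks


for days in (10, 30, 200, 400):
    print(days, classify_carrier(days).value)
```

A duration-only rule like this presupposes a known date of infection. It therefore cannot be applied to chronic carriers with no documented history of typhoid fever, which is one reason a direct biomarker-based test for carriage would be useful.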
2.2.4. Pathogenesis and host-pathogen interactions
The human-restricted nature of S. Typhi and S. Paratyphi A has limited
research on pathogenesis and host-pathogen interactions. There have been
several attempts to establish suitable animal models in which the disease
mimics enteric fever in humans with the most common being different murine
models of S. Typhimurium33. However, due to differences in the genome
between S. Typhimurium and S. Typhi34,35, caution should be taken when
interpreting results from the S. Typhimurium models. A promising discovery is
that mice lacking TLR-11 can be infected with S. Typhi, which could make it a
suitable animal model, although it is unclear how similar this infection is to
enteric fever in humans36,37. Another strategy is to use so-called human
challenge models, where healthy individuals are infected with S. Typhi38,39.
Typically, enteric infections can be divided into three main groups40. The first
two groups are classified as gastroenteritis; either non-inflammatory diarrhoea,
caused by Vibrio cholerae and different E. coli pathovars, or inflammatory
diarrhoea, caused by Shigella spp., Campylobacter jejuni, and the non-
typhoidal Salmonella40,41. The third group is classified as enteric fever and is
characterized by fever and abdominal pain, usually without diarrhoea, caused
by typhoidal Salmonella41. This classification includes two types of Salmonella
bacteria that cause different diseases: typhoidal (S. Typhi and S. Paratyphi
serovars) and non-typhoidal Salmonella (S. Typhimurium and S. Enteritidis).
This fact, together with the use of S. Typhimurium in animal models of enteric
fever, makes it interesting to compare the diseases caused by typhoidal
and non-typhoidal Salmonella. Some differences are listed in Table 1.
Typhoidal Salmonella infection has a number of unique features41,42: i) it is of
gastrointestinal origin but often lacks the characteristic signs of mucosal
inflammation and diarrhoea and ii) it is systemic but has a low number of
bacteria in the blood and is not typically associated with septic shock. Non-
typhoidal Salmonella usually causes a localized gastrointestinal inflammation
with diarrhoea in otherwise healthy patients. In patients with a weakened
immune defence bacteraemia is also a common feature and it is associated with
septic shock41.
Table 1. Differences between typhoidal and non-typhoidal Salmonella.
Feature | Typhoidal Salmonella | Non-typhoidal Salmonella
Salmonella serovars | S. Typhi, S. Paratyphi A, B, C | S. Typhimurium, S. Enteritidis and many others
Host | Human-restricted | Wide range of hosts
Type of disease | Enteric fever | Gastroenteritis with inflammatory diarrhoea
Location | Intestine and mesenteric lymph nodes and systemic spread to liver, spleen and bone marrow | Localized to intestine and mesenteric lymph nodes
Incubation time | 7-14 days¹³ | 12-72 h
Disease duration | Up to 3 weeks | <10 days
Stool examination | Mononuclear cells | Neutrophils
Acute phase response | Weak | Strong
Associated with septic shock | Not usually | Yes
Unique SPIsᵃ (S. Typhi vs S. Typhimurium)³⁵ | SPI-7, 15, 17, 18 | SPI-14
Vi capsule | In S. Typhi and S. Paratyphi C | Not present
Typhoid toxin⁴³ | In S. Typhi and S. Paratyphi A | Not expressed
Information from Tsolis et al.41 and Raffatellu et al.44 unless stated otherwise.
ᵃ SPI – Salmonella pathogenicity island.
Bacterial invasion normally leads to detection of so-called pathogen-associated
molecular patterns (PAMPs) by the host's toll-like receptors (TLRs) of the
innate immune system, which triggers the production of neutrophil-attracting
chemokines and pro-inflammatory cytokines45. The pro-inflammatory cytokines
cause fever and stimulate an acute phase response; if the acute phase response
is too strong, it can lead to systemic inflammatory response syndrome and
sometimes septic
shock46. Patients with enteric fever have higher levels of these cytokines in
serum compared to healthy persons but much lower compared to patients
during septic shock44,47. The presence of the Vi capsule in S. Typhi (mentioned
in chapter 2.2.1. Causative bacteria) has been associated with the prevention of
recognition of one of the PAMPs, bacterial outer membrane lipopolysaccharide
(LPS), by host TLR-4⁴⁸. The Vi capsule, which is not present in S. Typhimurium,
might be one contributing factor to why the classical response with neutrophilia
and inflammatory diarrhoea is not initiated during S. Typhi infection. This
could be one of the stealth mechanisms S. Typhi uses to evade detection by
the immune defence and cause the unique typhoidal disease. It could also
explain the longer incubation period compared to non-typhoidal Salmonella. It
should be highlighted that there must be other mechanisms and processes
unique to typhoidal Salmonella since the S. Paratyphi A and B serovars lack the
Vi capsule42. The Vi capsule will be discussed more in chapter 4.3. Paper III.
Another important feature related to the host-pathogen interactions during
enteric fever is the presence of the typhoid toxin, which is expressed specifically
by both S. Typhi and S. Paratyphi A43. The toxin induces many of the typical
enteric fever symptoms when administered to mice, but more research is
needed to investigate the mechanisms of the toxin49.
Many of the observations regarding the mode of infection during enteric fever
have been made in the mouse models using S. Typhimurium, and therefore the
exact mechanisms in human enteric fever are unclear. Initially, the bacteria
enter the gastrointestinal tract, after ingestion of contaminated food or water,
and survive the gastric acid13. In the small intestine they adhere to and invade
the mucosal epithelial cells, preferably the microfold (M) cells on Peyer's
patches (layer of lymphoid nodules)13,50. The bacteria translocate to the
lymphoid follicles, the mesenteric lymph nodes and to reticuloendothelial cells
in the spleen and liver via the bloodstream (transient primary bacteraemia)13.
During this phase the bacteria encounter macrophages and are phagocytosed;
however, they can form Salmonella-containing vacuoles and survive and
replicate within the macrophages51. After the incubation period (7-14 days)
the bacteria re-enter the bloodstream, but at a low concentration
(<1 colony-forming unit/ml blood)52. When this secondary bacteraemia is
reached, the clinical symptoms of fever and general discomfort usually start to
appear13. Secondary infections can also appear in the liver, spleen, bone marrow
and gallbladder.
2.2.5. Clinical features
The clinical features of enteric fever are variable and range from a mild disease
with few symptoms to a severe illness with multiple complications that can be
fatal12,53. There are also differences in the clinical features between children and
adults, as well as between children of different age groups54. The most common
feature is a fever that rises slowly and then becomes sustained. Because
many people in endemic areas self-treat with antimicrobials prior to
seeking care, this classic fever-pattern is not always present55,56. Other common
features are headache, general discomfort (malaise), loss of appetite (anorexia),
dry cough, muscle pain (myalgia), abdominal discomfort, coated tongue, and
hepatomegaly (enlarged liver) and/or splenomegaly (enlarged spleen)13,53,57,58.
The variety of clinical features is further highlighted by the fact that either
constipation or diarrhoea can occur (diarrhoea is more common among young
children)59,60. A relative bradycardia (slower heart rate) is also associated with
enteric fever (compared to tachycardia in sepsis) although it is not common
everywhere13. Rose spots, small lesions usually on the chest, are considered to
be specific for enteric fever; however, their frequency varies between 5 and 30%13.
Complications can occur in 10-15% of the cases and the most common are
gastrointestinal bleeding (usually self-limiting but can be fatal), intestinal
perforation (considered the most severe complication) and typhoid
encephalopathy (often with high mortality)13.
2.2.6. Treatment and antimicrobial resistance
Before enteric fever could be treated with antimicrobials, fatality rates of up to
26% were reported56. Even today young children, the elderly and
immunocompromised patients are at high risk for severe disease56,61, with higher
risks during infections by antimicrobial-resistant strains. With appropriate
treatment, fatality rates decrease dramatically, which indicates the importance
of correct and timely treatment. Successful treatment results in the
prevention of severe disease and preferably complete eradication of the bacteria
to decrease the risk of relapse and transmission4. The risk of relapse can vary
between 5 and 20% and it often occurs with milder symptoms12. Most cases of
uncomplicated enteric fever can be treated at home with antimicrobials, but for
more complicated cases hospital care is needed with close monitoring for
detection and management of complications12,13.
Chronic carriers of S. Typhi or S. Paratyphi A should be treated with
antimicrobials for up to 28 days and if gallstones are present a combination of
antimicrobials and surgery to remove the gallbladder (cholecystectomy) is
usually needed4. In some cases the carriage state can remain even after removal
of the gallbladder20. It can be discussed whether treatment of all chronic carriers
with antimicrobials can be justified; however, the contribution to disease
transmission and the increased risk of gallbladder cancer highlight the
importance of monitoring this group.
The choice of antimicrobial is crucial for the treatment and it is of major
importance to have information about the distribution of antimicrobial
sensitivity for the S. Typhi and S. Paratyphi A strains in the current area62.
Other factors to take into account are the effectiveness, cost of and accessibility
to the antimicrobials under consideration12. As for many other bacterial
infections, antimicrobial resistance is a major problem for enteric fever.
Epidemics caused by S. Typhi strains resistant to the first antimicrobial used for
treating enteric fever, chloramphenicol, were reported in the early 1970s with
outbreaks in many parts of the world63–65. During the late 1980s and the 1990s
there was a large number of reports of strains exhibiting resistance to the
traditional first-line antimicrobials chloramphenicol, ampicillin and
trimethoprim-sulfamethoxazole66–69. These strains were named
multidrug-resistant (MDR), and led to great difficulties in the treatment of enteric fever70.
Due to the increase of MDR strains, fluoroquinolones were recommended and
widely used71–74. Advantages of fluoroquinolones include high efficacy even
at short courses of 3-5 days, rapid fever clearance and low rates of relapse and
faecal carriage, especially when compared to the first-line antimicrobials12,13.
The fluoroquinolones are often well tolerated. There were previous concerns
regarding the treatment of children but short courses were proven suitable75,76.
However, S. Typhi and S. Paratyphi A strains with decreased susceptibility and
also resistance to fluoroquinolones have emerged in multiple countries77–80.
Treatment recommendations for uncomplicated enteric fever from the World
Health Organization (WHO) state that fluoroquinolones should be the first
choice in areas where there is no quinolone resistance12. In areas with quinolone
resistance, azithromycin or the third-generation cephalosporin ceftriaxone
should be used, with cefixime as an alternative12. Recommendations are similar for
severe enteric fever but with a general increase in treatment duration.
Azithromycin is considered safe and of high efficacy81. It has, however, been
described as a “back-to-the-wall-option”, and even though azithromycin-resistant
strains are currently rare they may become more common if this drug comes into
empirical use82. Ceftriaxone has limitations due to the route of administration
(parenteral) and the relatively high cost83. The oral cefixime has been associated
with high rates of treatment failure84.
Two recent studies from Nepal demonstrated increased minimum inhibitory
concentrations (MICs) for fluoroquinolones over time which resulted in longer
clearance times and worse outcome83 and problems in the efficacy of the fourth-
generation fluoroquinolone gatifloxacin due to fluoroquinolone-resistant
strains85. These studies stated that fluoroquinolones should no longer be
recommended as the first-choice treatment. Together with a commentary82 they
highlight the major problems associated with the treatment of enteric fever due
to antimicrobial resistance, which in the worst of cases could lead to fatality
rates more similar to those of the pre-antimicrobial era. Again, this can be related to
the insufficiency of current diagnostic methods, which can lead to
inappropriate, ineffective and sometimes unnecessary antimicrobial treatment
that postpones the administration of correct treatment and subsequently
prolongs the illness and increases the risk of complications2.
2.2.7. Prevention and vaccination
There are five main areas important for the prevention and control of enteric
fever, according to the WHO: water safety, food safety, sanitation, health
education and vaccination12. For water safety both improved water quality and
quantity are of importance12. It has been suggested that efforts to improve the
water quality are of greater value than the improvement of sanitation
when it comes to enteric infections86. Health education is an area that cannot be
neglected since the spread of information about disease transmission and the
importance of hygiene can lead to changes in people’s behaviour12. The
successful major reduction of enteric fever in most high-income countries due to
improvements in water quality and sanitation has shown the value of these
strategies87. Most beneficial for resource-limited areas, where enteric fever is
common, would be a general improvement of the living standards. This would
dramatically reduce the occurrence of enteric fever and many similar diseases.
The work towards such improvements should be made in combination with
other preventative measures such as vaccination.
The first vaccine against S. Typhi was a killed whole-cell vaccine, but this type is
no longer in use due to too many severe reactions towards the vaccine88. There
are two licensed vaccines currently in use for S. Typhi infections, a single dose
parenteral Vi polysaccharide vaccine and a live attenuated oral vaccine called
Ty21a (usually three doses)88. Both vaccines are safe and usually well tolerated,
but they have a relatively short duration of protection and they are not licensed for use in
children <2 years88,89. An oral route of administration (as for Ty21a) instead of
parenteral (as for Vi) is easier to administer and usually more accepted90.
There is a large number of novel vaccines under development and the most
promising are the typhoid conjugate vaccines, which should be possible to
administer to young children, last longer and have higher efficacy15,91. Another
important area is the development of an S. Paratyphi A vaccine since there is
currently no licensed vaccine available92. In a position paper, the WHO
recommends that “countries should consider the programmatic use of typhoid
vaccines for controlling endemic disease” and offers suggestions on how
vaccination should be considered for parts of endemic populations at increased
risk, children in specific areas, during outbreaks and for travellers to endemic
areas88.
Improved diagnostic methods are of great importance for enteric fever
prevention, since accurate assessments of disease burden in specific areas will
assist in decision-making regarding vaccination programs and water/sanitation
improvement measures2. In addition, the ability to detect carriers is an
important strategy to decrease the spread of enteric fever62.
2.3. Diagnosis
The diagnosis of enteric fever is still primarily clinical. The most reliable
laboratory test is repeated blood cultures or culture of bone marrow aspirates.
These culture tests are complicated, time consuming, and require access to a
high-quality microbiological laboratory, which is often lacking in locations
where the disease is most common.
2.3.1. Diagnostics in resource-limited areas
The highest disease burden for enteric fever and thus the greatest need for
accurate diagnosis is in areas with limited resources. This fact adds substantial
complexity to the area of diagnostics. For many of the infectious diseases
common in low- and middle-income countries there are possibilities for
treatment. However, the lack of adequate diagnostic tests leads to both under-
treatment and over-treatment93,94. Under-treatment can result in disease
complications and prolonged illness. Over-treatment can lead to missed
alternative diagnoses and to a shortage in the supply of common treatments.
Most importantly, over-treatment could increase the development of
antimicrobial-resistant pathogens.
The WHO has summarized the characteristics of what would be an ideal
diagnostic test in resource-limited areas using the acronym ASSURED. This test
should be affordable, sensitive, specific, user-friendly, rapid, equipment-free
and delivered to those who need it95. To have a sufficiently accurate test that is
also suitable for use in limited infrastructure settings is usually problematic96. A
few infrastructural resources that should be considered are electricity, clean
water supply, cold storage possibilities, laboratory space and the need for
trained personnel96. In addition to accuracy and equipment requirements, a
crucial characteristic of a diagnostic test is its speed (i.e. time to result), since
a rapid result is needed to properly guide treatment. In many
primary healthcare settings it is noted that patients do not always return for the
result of a test that takes a long time and therefore risk not receiving appropriate
treatment95. The type of sample used and the volume needed to obtain an accurate
diagnosis are also of importance since the sample collection method will affect the
suitability for use in resource-limited settings93. Blood, urine, faeces and saliva
are commonly used sample types, but it should be noted that there can be
negative cultural attitudes towards blood samples in certain areas93.
The development of new diagnostic tests can be divided into several steps,
including identification of suitable diagnostic targets/biomarkers, the conversion
into a suitable test format and high-quality clinical validation of the test.
Scientific and technological advances have led to an increased understanding of
host-pathogen interactions during infection and new techniques including the
omics approaches may be used to identify diagnostic targets/biomarkers
originating from both the pathogen and the host97. Advances in areas such as
microfluidics and nanotechnology are also promising when it comes to
converting the discovered targets/biomarkers into simple and reliable diagnostic
tests98.
Apart from the scientific and technical issues, another complicating factor for
the development of diagnostic tests for use in resource-limited areas is the lack
of market-based funding, although there is a rise in initiatives through public sector
investments95. Other factors to consider during the development of a diagnostic
test are cultural and social aspects in the specific areas and also safe disposal of
waste products93. Once a test has been developed and validated a final barrier is
to make sure it is used, especially in cases where treatment is easily available
and cheap94. One positive example of this is the distribution of rapid diagnostic
tests for malaria through retail outlets such as drug shops99. This demonstrates that new
modern tests can be implemented in resource-limited areas of the world.
2.3.2. Enteric fever diagnosis
Parry et al. proposed the need for three types of diagnostic tests for enteric fever
with different purposes and characteristics2. First, there is a need for a rapid,
simple, sensitive and specific test for acute disease. This would provide more
appropriate care which would be beneficial for several reasons. The risk for
complications would decrease along with shortened disease duration and a
decreased risk of transmitting the bacteria. In addition, a more rational use of
antimicrobials might lead to a decreased development of antimicrobial
resistance. Second, a reliable test for the detection of chronic carriers is needed
to decrease transmission. Finally, a test for both acute and convalescent patients
is needed to make accurate estimations of the disease burden. Such information
can guide decisions regarding vaccination programs and other preventative
strategies. Focusing both on acute case detection and disease surveillance is in
line with the recommendations for diagnostic test development95. However,
diagnostic methods for related febrile illnesses also need to be improved in the
work towards improved diagnosis of enteric fever94.
In recent years there have been several reviews focusing on current diagnostic
methods for enteric fever and novel methods under development2,3,94,100. These
reviews all highlight the importance of improved diagnostic methods. Some
factors contributing to the difficulties of enteric fever diagnosis are: the rather
long incubation period, the varying and unspecific clinical features, the low
numbers of bacteria (especially in blood) and the cross-reactivity of antigens3. A
brief overview of current and novel methods for diagnosis of enteric fever will be
given in the text below and in Table 2.
Table 2. Summary of current diagnostic methods of enteric fever.

| Method | Description | Advantages | Disadvantages |
|---|---|---|---|
| Acute enteric fever | | | |
| Microbiological culture | Isolation of S. Typhi or S. Paratyphi A | Definite diagnosis, use for antimicrobial susceptibility testing | Long time to result, require microbiological laboratories |
| Microbiological culture: blood culture | | Easier collection vs. bone marrow | Inaccurate sensitivity |
| Microbiological culture: bone marrow culture | | Improved sensitivity vs. blood | Invasive and technically challenging |
| Serology: Widal test | Agglutinating antibodies against LPS, flagella | Rapid, simple, low laboratory requirements | Inaccurate sensitivity and specificity, ideally need paired samples |
| Serology: other RDTs | Antibody detection against LPS, outer membrane protein etc. | Rapid, simple, low laboratory requirements, improved accuracy vs. Widal | Still inaccurate sensitivity and specificity |
| Nucleic acid amplification | Detection of specific gene sequences through PCR | Amplification of low number of bacteria, detect unculturable bacteria, rapid | Variable sensitivity in culture neg., high laboratory requirements |
| Chronic enteric fever | | | |
| Bile detection | Isolation of S. Typhi or S. Paratyphi from bile or gallstones | Definite diagnosis | Highly invasive |
| Serial stool samples | Isolation in several samples due to intermittent shedding | Definite diagnosis | Inaccurate sensitivity, logistic issues |
| Vi antigen | Detect antibody response to Vi antigen | More practical vs. bile detection and serial stool samples | High background levels in endemic areas |

2.3.2.1. Current diagnostic methods for acute enteric fever
The symptoms and other clinical features of enteric fever are unspecific and
therefore it is usually difficult to make a reliable clinical diagnosis. It has to be
differentiated from other febrile illnesses including dengue fever, malaria,
rickettsial infections, leptospirosis, tuberculosis, brucellosis and sepsis due to
other pathogens2,4,101. Attempts have been made to combine clinical signs and
laboratory measurements to make diagnostic predictions, but so far with limited
success94. For a definitive diagnosis of enteric fever the bacteria must be
isolated, usually from blood or bone marrow12. The isolated bacteria can then be
used for antimicrobial susceptibility testing. Blood culture is considered the gold
standard method but comes with several limitations. Its sensitivity varies
widely, with 40% to 80% of the presumptive cases detected25,52,102. Factors affecting
the sensitivity are prior antimicrobial treatment, blood collection volume and
time13. Bone marrow culture usually has a higher sensitivity compared to blood
culture, mainly due to a higher number of organisms in bone marrow but also
because of smaller negative impact of prior antimicrobial treatment25,52.
However, the main limitations of bone marrow culture are its invasiveness
and the technical equipment needed, which mean that it is rarely used. Overall, the
limitations of microbiological culture are the need for substantial laboratory
equipment, which is lacking in many endemic countries103, and the long time to
result (which can range from 1 to 5 days for blood culture)94. The limitations of
blood culture also affect the evaluation of new diagnostic methods since it is
widely used as the reference method.
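The influence of blood collection volume on culture sensitivity can be illustrated with a rough calculation. The sketch below is not taken from the cited studies; it assumes that organisms are randomly (Poisson) distributed in blood and that a single viable organism in the sample is enough for a positive culture, ignoring effects such as prior antimicrobial treatment. The concentration of 0.5 colony-forming units/ml is a hypothetical value chosen only to be consistent with the reported <1 colony-forming unit/ml.

```python
import math


def prob_at_least_one_organism(cfu_per_ml: float, volume_ml: float) -> float:
    """Probability that a blood sample contains at least one viable organism.

    Assumes a Poisson distribution of organisms in the sample (an
    idealisation, not a result from the cited literature).
    """
    if cfu_per_ml < 0 or volume_ml < 0:
        raise ValueError("concentration and volume must be non-negative")
    expected_count = cfu_per_ml * volume_ml
    return 1.0 - math.exp(-expected_count)


if __name__ == "__main__":
    hypothetical_cfu_per_ml = 0.5  # assumed value, consistent with <1 CFU/ml
    for volume_ml in (1, 5, 10, 20):
        p = prob_at_least_one_organism(hypothetical_cfu_per_ml, volume_ml)
        print(f"{volume_ml:>2} ml sample: P(at least one organism) = {p:.2f}")
```

Under these assumptions, the probability of capturing any organism at all rises steeply with the sampled volume, which is one way to see why small blood volumes limit the achievable sensitivity.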
Among the serological methods, the Widal test is the most widely known. It
measures agglutinating antibodies against the S. Typhi or S. Paratyphi A antigens
LPS or flagella in serum. Due to its simplicity, speed and minimal
equipment requirements it is still widely used in several areas104,105. However, there
are major problems associated with the Widal test, including inadequate
sensitivity (affected by prior antimicrobial treatment) and specificity (due to
antigenic cross-reactions), inaccurate interpretation, misuse and the need for
paired samples (to compare acute and convalescent phases)94,104. More recent
serological rapid diagnostic tests (RDTs) have been developed with the desired
properties of simplicity and speed for point-of-care use. Most of these tests
detect antibodies using different methods; see the review by Andrews and Ryan for
detailed information94. A recent database review of studies evaluating RDTs
concludes that they have moderate accuracy but also that many of the studies
themselves are somewhat biased106. The RDTs may show increased sensitivity
and specificity compared to the Widal test but they are still not sensitive and
specific enough to be routinely used2. Nucleic acid amplification methods
through polymerase chain reactions (PCR) (both conventional and real-time
PCR) have been examined. They have the potential of amplifying small numbers
of bacteria and also detect bacteria that cannot be cultured2. Another advantage
is the increased speed compared to blood culture. The PCR-based methods
usually have high sensitivity for blood-culture-positive samples but lower for
blood-culture-negative samples107,108. Limitations to the use of PCR methods in
low-resource settings include the need for specialized laboratory equipment,
relatively large blood volumes, and the relatively high cost2.
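Since sensitivity and specificity recur throughout the comparison of these methods, a minimal sketch of how they are computed from a two-by-two comparison against a reference method may be useful. The class and function names and the counts in the usage example are hypothetical and do not come from any cited study. Note that when the reference method is itself imperfect, as discussed above for blood culture, the values obtained describe agreement with the reference rather than true accuracy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing an index test with a reference method."""
    true_pos: int
    false_pos: int
    false_neg: int
    true_neg: int


def accuracy_measures(c: ConfusionCounts) -> dict[str, float]:
    """Return sensitivity, specificity and predictive values."""
    denominators = (
        c.true_pos + c.false_neg,  # reference-positive samples
        c.true_neg + c.false_pos,  # reference-negative samples
        c.true_pos + c.false_pos,  # test-positive samples
        c.true_neg + c.false_neg,  # test-negative samples
    )
    if min(denominators) == 0:
        raise ValueError("each row and column of the table needs at least one sample")
    return {
        "sensitivity": c.true_pos / denominators[0],
        "specificity": c.true_neg / denominators[1],
        "ppv": c.true_pos / denominators[2],
        "npv": c.true_neg / denominators[3],
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    counts = ConfusionCounts(true_pos=60, false_pos=30, false_neg=20, true_neg=90)
    for name, value in accuracy_measures(counts).items():
        print(f"{name}: {value:.2f}")
```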
2.3.2.2. Current methods for carrier detection
The current gold standard method for diagnosing chronic Salmonella carriers is
detection of Salmonella bacteria in bile or on gallstones during surgery to
remove the gallbladder. However, this is a highly invasive method and can be
applied only to a restricted patient group. Thus there is a need for other less
invasive methods20. Another approach has been to collect serial stool samples
but this lacks in sensitivity and has logistic issues109. The detection of the Vi
antigen has been suggested as an option for carriage detection but this method
might have limited use in endemic areas due to high background levels of anti-
Vi antibodies110.
2.3.2.3. Novel diagnostic approaches for enteric fever
Several diagnostic methods for enteric fever are under development. They can
be divided into two categories: improvements and further developments of
current methods, and novel approaches. A few examples in the first category are
improved culturing (improved growth conditions, electricity-free incubation),
improved amplification methods (enrichment steps to improve sensitivity,
isothermal procedures to simplify instrumentation, combined culture and PCR
for higher speed) and improved serological methods3,94,100. The second category
includes amplification-free molecular methods (DNA capture probes),
immunoscreening to detect novel antigens and antibodies and detection of host
factors apart from antibodies (including the omics techniques transcriptomics,
proteomics and, as further described and discussed below, metabolomics)3,94,100.
2.4. Metabolomics
In this section, the research field of metabolomics will be covered including
applications with special emphasis on biomarker detection and infectious
disease. Method details and technical aspects will be described and discussed in
chapter 3. Methods.
2.4.1. General metabolomics
Metabolomics is the complete and quantitative analysis of metabolites in a
biological system during environmental or physiological stimuli. Metabolites are
defined as low molecular weight molecules including a wide range of compound
classes, such as sugars, amino acids, fatty acids, nucleosides and organic acids,
which together constitute the metabolome. Metabolomics, as we know it today, has
since the early 2000s been described in accordance with the definitions of
metabolomics or metabonomics by Fiehn and Nicholson111,112. However,
approaches very similar to metabolomics were used already in the 1960s113.
Metabolomics is the most recent part of the so-called omics cascade also
including genomics, transcriptomics and proteomics. Since metabolites are the
end products of this omics cascade the metabolome is considered to be closest
to the function or phenotype114, being the result of the combination of genetic
and environmental factors115. The metabolome is extremely dynamic suggesting
that any changes occurring to the studied system would be reflected instantly at
the metabolite level providing a snapshot of the status of the system116.
Due to the wide variety of molecules classified as metabolites and the large
concentration differences between them it is impossible to obtain a complete
metabolite coverage using only one analytical technique. Therefore, a
compromise in terms of coverage must be made when selecting a single method.
Alternatively, the combination of several methods can be used to increase the
coverage117. Two main types of analytical techniques are commonly used in
metabolomics, mass spectrometry (MS)-based techniques and nuclear magnetic
resonance (NMR). The MS-based techniques are often used in combination with
a chromatographic step for separation of metabolites prior to MS detection.
Advantages with the MS-based techniques compared to NMR are the higher
sensitivity, the possibility of using smaller sample volumes and the lower
instrument costs116. Advantages of NMR include the non-destructive nature of
the method and the simple sample preparation118. There are two main analytical
approaches within metabolomics: non-targeted and targeted analysis116. In non-
targeted or global metabolomics a wide range of metabolites are analysed to
gain information about general differences in the metabolome in relation to a
specific condition. In targeted metabolomics only specific pre-selected
metabolites are measured and quantified, for example a specific metabolite class
or a set of metabolites involved in a biochemical pathway of interest. There are
advantages with both approaches and they can be used in a complementary manner115,116.
First, a non-targeted approach can be applied resulting in a broad metabolite
coverage, although with a compromise in terms of metabolite detection and no
absolute quantification. Second, a targeted approach can be applied based on
the results of the non-targeted analysis to gain absolute quantification of
specific metabolites of interest. Metabolomics analysis results in very complex
data, especially for the non-targeted approach. Therefore, sophisticated data
processing and analysis methods are needed to get the data into a representative
format and to extract the relevant information from it. For this, methods based
on multivariate data analysis, able to simultaneously handle, model and
investigate large numbers of metabolites, are commonly used. For more
information see chapters 3.4. Processing of metabolomics data and 3.5.
Multivariate data analysis. A number of recent reviews summarize several
aspects of the metabolomics field including recent advancements115,119,120.
2.4.2. Biomarker detection using metabolomics
A biomarker can be defined as “a characteristic that is objectively measured and
evaluated as an indicator of normal biological processes, pathogenic processes,
or pharmacologic responses to a therapeutic intervention”121. There are many
different applications of biomarkers in human disease. The characteristics of the
disease and the clinical setting determine which type of biomarker or information
is most relevant. Biomarkers can for instance be used for disease diagnosis
and prognosis, treatment selection and treatment evaluation121,122. There is also
a growing interest in biomarkers for personalized or precision medicine123.
The likelihood of finding a single biomarker specific enough for a particular
disease is low. With the exception of diseases classified as inborn errors of
metabolism, more than one biomarker (metabolite) is commonly needed to
reach high specificity124. The concept of using a combination of biomarkers is
not new; it is commonly used clinically when combining different
physiological parameters and observations to obtain the correct diagnosis.
Metabolic biomarker panels have another advantage apart from just combining
several metabolites: they can also take advantage of the correlation between the
included metabolites to increase predictivity and specificity. Using multivariate
data analysis methods (described in chapter 3.5. Multivariate data analysis)
these correlation structures can be used to find a panel of biomarkers, a
so-called latent biomarker, to facilitate prediction and interpretation.
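The idea that a panel can separate groups better than any single metabolite can be illustrated with a toy example. The sketch below is a deliberately simplified stand-in and not the multivariate modelling described in chapter 3.5: it standardizes each metabolite and averages the standardized values with equal weights, whereas a latent variable method would estimate the weights from the data. The metabolite names and all values are synthetic and chosen only for illustration.

```python
from statistics import mean, pstdev


def standardize(values: list[float]) -> list[float]:
    """Centre to zero mean and scale to unit (population) standard deviation."""
    centre = mean(values)
    spread = pstdev(values)
    if spread == 0:
        raise ValueError("a constant metabolite cannot be standardized")
    return [(v - centre) / spread for v in values]


def panel_scores(metabolite_columns: list[list[float]]) -> list[float]:
    """Equal-weight composite score per sample (simplified panel, assumption)."""
    standardized = [standardize(column) for column in metabolite_columns]
    # zip(*...) regroups the per-metabolite columns into per-sample tuples.
    return [mean(sample) for sample in zip(*standardized)]


# Synthetic data: three controls followed by three cases.
metabolite_a = [1.0, 2.0, 3.0, 2.5, 3.5, 4.5]
metabolite_b = [5.0, 3.0, 4.0, 6.0, 4.5, 5.5]
is_case = [False, False, False, True, True, True]

scores = panel_scores([metabolite_a, metabolite_b])
control_scores = [s for s, case in zip(scores, is_case) if not case]
case_scores = [s for s, case in zip(scores, is_case) if case]

if __name__ == "__main__":
    print("controls:", [round(s, 2) for s in control_scores])
    print("cases:   ", [round(s, 2) for s in case_scores])
```

In this synthetic data set each metabolite on its own overlaps between controls and cases, while the combined score separates the two groups.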
In a review by Mamas et al. three questions related to the usefulness of
metabolomics in the search for biomarkers are addressed125. The first question:
Can the clinician measure the biomarkers? – Many metabolites are already
being commonly measured in clinical settings. Well-known examples are
glucose and cholesterol. However, these examples are often single-metabolite
measurements. With the goal of using a biomarker panel, the ideal situation
would be to get a simultaneous measurement of several metabolites. This could
be achieved by developing specific targeted analysis methods e.g. using mass
spectrometry. However, to design more clinically feasible “bench-side” devices
for biomarker panel measurement further developmental work will be required.
The second question: Do the biomarkers add new information? – New
information can be provided applying the hypothesis that metabolites specific
for a disease condition can be secreted into the blood or other biofluids and then
be detected122. Using blood as the sample matrix can be considered to give a
more general picture of the biochemical status in the system126. This would be
beneficial especially for diseases with a complex pathophysiology that affects
many different systems. A great advantage of metabolite biomarkers is the
possibility of gaining information about disease mechanisms during the
discovery process. Another benefit is that the metabolome changes rapidly over
time which makes it possible to study disease progression and treatment
response (due to this, variation unrelated to the question will also be detected).
To be useful the biomarkers need to be validated in several studies from
multiple locations. The third question: Do the biomarkers help the clinician to
manage patients? – As mentioned previously there are many clinical uses for
biomarkers including diagnosis, treatment selection and treatment response.
Since many biomarkers can be used for these purposes in the clinic today it
should be possible to use metabolomics-based biomarkers in the future.
However, rigorous validation of the detected markers is needed before they can
be introduced to the clinic which will result in an increased development time.
2.4.3. Metabolomics in infectious diseases
In infectious diseases, there are a number of problems that metabolomics might
help to address, with improved diagnostics being one of the most important. More
specifically, it would be of major impact if specific biomarker patterns could be
detected for individual infections or groups of related infections with similar
disease management. Another area of high importance is the detection of
biomarker patterns related to antimicrobial resistance to guide treatment and
monitor treatment responses. Finally, metabolomics could be used as a tool for
increased understanding of disease mechanisms and host-pathogen interactions
during disease. When using metabolomics to study infectious diseases there
may be a risk of detecting metabolites that change in response to infection in
general and are thus too unspecific for a diagnostic test. However, it should
be possible to detect changes associated with host responses more specific to a
certain infection. Apart from the host response to infection there is also a
possibility of detecting metabolites originating from the pathogens themselves.
A combination of host- and pathogen-related metabolites would be optimal for
specific diagnosis.
Metabolomics has been used to study various infectious diseases. A few
examples will be covered here with focus on the different applications
mentioned above and on infections somewhat related to enteric fever. Malaria
and dengue fever are infections presenting many similar clinical features to
enteric fever. In two separate studies metabolite patterns were analysed in
human serum samples from patients with and without dengue fever and during
different disease stages127,128. These studies aimed to find metabolites with
diagnostic and prognostic values and to get an increased understanding of the
disease mechanism. Metabolomics has been applied to study malaria infections
in several studies. Examples include the investigation of i) metabolites in plasma
samples from young children comparing controls and patients with mild and
severe malaria129, ii) metabolites in urine samples comparing patients with
malaria infection and non-malarial fever to healthy controls130 and iii)
metabolites in urine samples from mice with malaria infection with and without
concurrent helminth infection131. Metabolomics has also been used to study
sepsis. Examples include metabolomics on human plasma or serum for
comparing i) children with septic shock or systemic inflammatory response to
healthy controls to obtain diagnostic and prognostic metabolites132, ii) patients
who survived sepsis to patients who did not, with the aim of finding a
combination of metabolites and clinical features that could be used for
predictions of sepsis outcome133 and iii) healthy volunteers given either an
endotoxin or a placebo, sampled at multiple time points (endotoxin challenge is often used as a model to
investigate systemic inflammatory response)134. Interestingly, the results of the
two studies in ii) and iii) above were compared and showed that similar
responses were seen during sepsis and in the endotoxin challenge135. The studies
mentioned here highlight the potential of using metabolomics for the study of
infectious diseases.
2.4.4. Omics studies related to enteric fever
There are other metabolomics studies related to enteric fever, apart from the
studies presented in this thesis. However, none of them focuses on diagnosis or
includes human samples. Two studies analyse metabolites during growth of
Salmonella Typhimurium. The first study investigates the mechanisms of
growth in bile while the second studies the effects of metabolites on virulence
expression136,137. In another study the impact of S. Typhimurium infection on
host metabolism was investigated in a murine typhoid model138. For other omics
techniques, human samples of typhoid fever have been analysed with diagnostic
focus. In a transcriptomics study, peripheral blood of patients with S. Typhi
infection was analysed during the acute, convalescent and recovery phases139. In
addition, a proteomics study investigated plasma proteins in chronic typhoid
carriers140. The lack of studies using metabolomics with focus on diagnosis of
enteric fever emphasizes the need for the studies included in this work.
3. Methods
Metabolomics involves many different techniques and procedures on the route
from sample collection to interpretable results. An overview of the
metabolomics workflow is shown in Figure 2; the individual steps are described and
discussed below.
3.1. Sample matrices
The main sample matrix in this work has been human plasma. However, in
Paper II human urine samples were also analysed.
3.1.1. Plasma/serum
Blood sampling is widely used for diagnostic purposes in the health care system.
Plasma, serum or whole blood are commonly used in metabolomics studies
since blood is easily accessible and routinely collected. Blood can be divided into
a cellular part and a fluid part. The cellular part includes erythrocytes (red blood
cells), leukocytes (white blood cells) and platelets. The fluid part, called plasma,
contains a large proportion of water (up to 95%) but also includes proteins,
metabolites and electrolytes141. If the blood is allowed to clot and the clotting
proteins are removed the remaining fluid is called serum. Apart from
metabolites related to blood clotting the metabolites of plasma and serum
should be comparable. However, several studies have reported differences in
metabolite patterns between plasma and serum142–144. In this work plasma has
been the biofluid of choice which facilitates comparison between the studies.
See Papers I-III for more information about the plasma collection procedures.
Figure 2. Overview of the metabolomics workflow.
3.1.2. Urine
Urine contains mainly water-soluble waste products excreted from the kidneys
but its role as a diagnostic biofluid is still of major importance. Comparing the
human urine and serum metabolomes145,146 reveals that a large number of urine
metabolites were not reported to be found in blood, although theoretically all
urinary metabolites should be present in blood. This could be related to the
concentration differences of metabolites in the two biofluids but also due to
analytical differences145. Advantages with the use of urine in metabolomics are
the non-invasive nature of the sample collection, the lower protein content
facilitating metabolite extraction, the large diversity of metabolites and the lack
of volume-limitation145,147. However, there are also disadvantages such as large
dynamic range (also a challenge for plasma) and highly varying pH, ionic
strength and osmolarity that complicate the analytical measurements148. See
Paper II for more information about the urine collection procedures.
3.2. Sample preparation for metabolomics analysis
Sample preparation, by extraction of metabolites from the sample
matrix/biofluid in use, is an important step in metabolomics analyses. The
sample matrix/biofluid and also the analytical technique to be used will affect
the choice of sample preparations. In this work, two combinations of
biofluid/analytical technique have been used, human plasma/GC(xGC)-TOFMS
and human urine/UHPLC-Q-TOFMS.
3.2.1. Metabolite extraction
To analyse metabolite levels in biofluids the cells have to be disrupted to release
their content. Molecules of high molecular weight (proteins) need to be removed
while keeping the low-weight metabolites. Since the goal of untargeted
metabolomics is to detect as many metabolites as possible (preferably using one
single method) and the fact that the metabolites are of high chemical diversity,
the extraction method must provide a good compromise to allow the extraction
of many metabolites.
3.2.1.1. Plasma extraction
The protocol for extraction and derivatization of human plasma used in this
work was developed by A et al.149 and forms the basis for the sample preparation
protocol for plasma/serum samples at the Swedish Metabolomics Centre (SMC)
(http://www.swedishmetabolomicscentre.se). Briefly, the metabolites were
extracted using a methanol/water extraction mix (8:1 v/v) including isotopically
labelled internal standards (IS) in a bead mill. After time on ice the samples
were centrifuged and the supernatants evaporated to dryness. Starting volumes
of 50 or 100 µl were used in the different studies depending on original sample
amounts available.
3.2.1.2. Urine extraction
For extraction of metabolites from urine (in Paper II) a modified version of the
protocol described by Want et al.148 was used. Compared to the extraction
procedure for plasma the extraction of metabolites from urine is simpler and
often faster with only two main steps including centrifugation and water
extraction. A starting volume of 100 µl urine was used; the samples were
centrifuged and 100 µl water (including two isotopically labelled IS) was
added to 50 µl of the supernatant.
3.2.2. Derivatization for GC-TOFMS and GCxGC-TOFMS analysis
To be able to run gas chromatography the compounds need to be volatile and
thermo-stable, which is not the case for many metabolites with more polar
properties. Therefore a chemical derivatization process is required. The most
common derivatization method is silylation where the hydrogens in -OH, -NH
and -SH groups of the metabolites are replaced by silyl groups150. There are
several reagents that can be used for silylation of metabolites and a common
reagent in metabolomics analysis is MSTFA (N-methyl-N-(trimethylsilyl)trifluoroacetamide)
in combination with TMCS (trimethylchlorosilane). In
addition to the silylation step there is usually a need for an oximation step to
stabilize carbonyl groups in aldehydes and ketones to prevent the formation of
enols151. The oximation step also prevents cyclization of sugars which decreases
the number of derivatization products for each sugar152. However, even with a
suitable choice of derivatization reagents there is still risk of bias due to artefact
formation, incomplete derivatization and formation of several derivatization
products. The same derivatization process (described in detail in Paper I) has
been used in all papers and includes methoximation with
O-methylhydroxylamine hydrochloride in pyridine (16 h reaction time) and
silylation with MSTFA + 1% TMCS (1 h reaction time).
3.3. Metabolite analysis
Three different analytical techniques have been used for metabolite analysis in
this work, all with a chromatographic method for metabolite separation coupled
to mass spectrometry for detection. The following techniques were used: one-
dimensional gas chromatography time-of-flight mass spectrometry (GC-
TOFMS), comprehensive two-dimensional gas chromatography time-of-flight
mass spectrometry (GCxGC-TOFMS) and ultra-high-performance liquid
chromatography quadrupole time-of-flight mass spectrometry (UHPLC-Q-TOFMS).
An overview of the analytical methods for all papers/cohorts is shown
in Table 3.
Table 3. Overview of analytical techniques and processing methods used in the papers/cohorts.
Analytical method    Data processing method    Paper/cohort
GCxGC-TOFMS          ChromaTOF/Guineu          Paper I – Nepali cohort
GCxGC-TOFMS          2D HMCR                   Paper II – Bangladeshi primary cohort, plasma
GCxGC-TOFMS          2D HMCR                   Paper III – Nepali cohort
GC-TOFMS             Target GC-MS              Paper II – Bangladeshi/Senegalese validation cohort
UHPLC-Q-TOFMS        MassHunter Qual/MPP       Paper II – Bangladeshi primary cohort, urine
3.3.1. Chromatography
The main principle of chromatography is separation of compounds in a mixture
through different degree of interaction between a mobile phase (containing the
sample) and a stationary phase. As the compounds are separated due to
differences in chemical properties they are registered in a detector at the time of
elution from the chromatographic system.
3.3.1.1. One-dimensional gas chromatography
In one-dimensional gas chromatography (GC), the sample, containing
derivatized compounds (metabolites), is made volatile through heated injection.
Once the sample is injected it is transported via a carrier gas (He, H2 or N2) as
the mobile phase to a column, usually a fused-silica capillary column, coated
with a stationary phase. Due to the different boiling points and polarities of the
metabolites, the interaction between the compounds in the mobile phase and
the stationary phase will be of different strengths and therefore provide a
separation of the metabolites. A temperature program for the column oven, with
stepwise temperature increase, is suitable for efficient separation. The
compounds eluting from the column are transferred to a detector. Examples of
detectors used in combination with GC are mass spectrometers, flame ionization
detectors and thermal conductivity detectors, but only the MS technique will be
described here.153
3.3.1.2. Comprehensive two-dimensional gas chromatography
The general principle of comprehensive two-dimensional gas chromatography
(GCxGC) is similar to one-dimensional GC. The main difference is the addition
of a second column. The use of two separate columns with different properties
will increase the separation and resolution compared to GC and will therefore
decrease the problem with co-eluting peaks154. The most common column setup
is to use a nonpolar first-dimension column and a shorter mid-polarity second-
dimension column155. However, it has been suggested that a polar first-
dimension column and a nonpolar second-dimension column would improve
the use of the second-dimension space156. Another important feature of GCxGC
is the use of a thermal modulator between the two columns that traps the
eluting metabolites from the first dimension and releases them to the second
column in segments regulated by hot pulses with cold times in-between157. The
result of the modulator is that narrow peaks for each metabolite will be created
in the second dimension which increases the sensitivity of the analysis158.
3.3.1.3. Liquid chromatography
In liquid chromatography (LC), a liquid mobile phase transports the sample
through the column, usually a packed column with a bound stationary phase
attached to a silica surface. There are different methods for LC separation, with
the most common method being reversed-phase chromatography, where a
nonpolar stationary phase is used together with a more polar solvent159. Similar
to the temperature programming in GC, gradient elution is used in LC. In
reversed-phase chromatography the solvent gradient is normally started with
mainly water followed by a gradual increase in the amount of organic solvent
(such as acetonitrile). The most common LC detectors are ultraviolet-visible
(UV-vis) detectors and mass spectrometers.
3.3.2. Mass spectrometry
The main principle of mass spectrometry is ionization of compounds at an ion
source, separation of the ions based on mass-to-charge ratio (m/z) in a mass
analyser and finally registration of the ions by a detector153. A common
ionization technique when GC is coupled to the MS is electron ionization (EI),
where high-energy electrons interact with the gaseous molecules. EI is a hard
ionization technique producing multiple fragments which makes it possible to
identify metabolites by comparing their fragmentation patterns. There are
extensive publicly available mass spectral libraries used for identification of EI
mass spectra. In contrast, when LC is coupled to the MS soft ionization
techniques are more commonly used. In soft ionization there is limited
fragmentation and the molecular ion is mainly observed. A high-resolution
instrument is needed for sufficient mass accuracy to be able to identify the
metabolites based on the molecular ion. Another strategy to facilitate the
identification is to use tandem MS (MS/MS) to obtain a higher degree of
fragmentation in the second MS step. An example of a soft ionization technique
is electrospray ionization (ESI), where a strong electric field is used to create an
aerosol of the liquid sample160. Electrospray ionization can be performed in
positive and negative mode and usually both modes are run for a wider
coverage.
After the ionization, the ions are separated in a mass analyser. One type of mass
analyser is the time-of-flight (TOF). In the TOF technique, the ions are
accelerated by an electrical field creating ions with the same kinetic energy and
then the ions enter a field-free region where the velocity is determined by
m/z153. Therefore, ions with a smaller m/z will travel through the field-free
region and reach the detector in a shorter time than ions with larger m/z.
Another type of mass analyser is the quadrupole, which contains four rods
between which a radio frequency field is applied; depending on the applied voltages, only
ions with specific m/z values can reach the detector160. Scanning is performed to
allow ions with all different m/z values to be detected. As the ions reach the
detector a signal is created with an intensity related to the abundance of the
specific ion.
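The dependence of flight time on m/z in the TOF analyser can be made explicit with a short derivation. The symbols are introduced here for illustration only and are not taken from the thesis: $m$ is the ion mass, $z$ the charge number, $e$ the elementary charge, $U$ the accelerating potential difference and $L$ the length of the field-free region. This is an idealized sketch that ignores initial energy spread and reflectron geometry.

```latex
\[
zeU = \tfrac{1}{2} m v^{2}
\quad\Longrightarrow\quad
v = \sqrt{\frac{2zeU}{m}},
\qquad
t = \frac{L}{v} = L\sqrt{\frac{m}{2zeU}} \;\propto\; \sqrt{\frac{m}{z}}
\]
```

Under these assumptions the flight time scales with the square root of m/z, so an ion with four times the m/z of another reaches the detector after twice the time.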
3.3.3. GC-TOFMS analysis
Gas chromatography coupled to mass spectrometry (GC-MS) is a common
analytical method in metabolomics152,161,162. The main advantages of GC-MS
techniques compared to LC-MS techniques are i) the large number of
commercially available mass spectral libraries for identification, ii) fewer
problems with matrix effects, iii) the ability to use comparable retention indexes
(RI) during the identification and iv) the higher chromatographic
resolution116,160. The main drawbacks compared to LC-MS are i) the limited
range of detectable metabolites and ii) the need for derivatization160.
The GC-TOFMS analysis in this work was performed according to the protocol
developed by A et al.149. In short, the samples were automatically injected in
splitless mode. Using helium carrier gas, the samples were transported to a
fused-silica capillary column coated with a nonpolar stationary phase. A
temperature program with a rapid temperature increase was used. Following
the chromatographic separation, the eluted compounds were ionized with EI
and the ions separated using the TOF mass analyser followed by detection of the
derivatized compounds. For more details regarding the analysis see
supplementary material for Paper II. Examples of a chromatogram and an EI
mass spectrum from a GC-TOFMS analysis are shown in Figure 3 and Figure
4.
Figure 3. Example of a chromatogram from GC-TOFMS analysis of one plasma sample from Paper II. The retention times are shown along the x-axis and the peak intensities are shown along the y-axis.
Figure 4. Example of an EI mass spectrum from GC-TOFMS analysis from Paper II. The mass-to-charge ratios, m/z, of the ions are shown along the x-axis and the ion intensities are shown along the y-axis.
3.3.4. GCxGC-TOFMS analysis
Comprehensive two-dimensional gas chromatography coupled to mass
spectrometry (GCxGC-MS) has become more commonly used in different
applications in the metabolomics field due to its sensitivity and resolution
advantages over GC-MS163–166. Another advantage is the fact that the mass
spectral libraries available for GC-MS can be utilized in the identification
process. The drawbacks with GCxGC-MS are the more complex data which
makes the data processing more challenging155,167 and the often relatively long
analysis times.
The procedure for GCxGC-TOFMS analysis used in this work has been similar in
Papers I-III with some minor modifications. Briefly, the samples were
automatically injected in splitless mode with helium as carrier gas. The column
setup was a first-dimension polar column followed by a second-dimension
shorter, nonpolar column. A quad-jet thermal modulator was used between the
two columns with 5 s modulation time. The same, slow temperature program
was used in both columns with a temperature offset for the second column of
+15 °C. After the chromatographic separation the eluted metabolites were
ionized with EI and the ions were separated using the TOF mass analyser
followed by detection. An example of a surface plot chromatogram from
GCxGC-TOFMS analysis is shown in Figure 5. For more details regarding the
GCxGC-TOFMS analysis see Paper I.
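The role of the modulation time can be illustrated by how a continuous detector trace is commonly arranged into a two-dimensional chromatogram such as the one in Figure 5. The sketch below is not the instrument software's procedure; it only assumes that every modulation contains the same number of data points. The 5 s modulation time comes from the text, whereas the acquisition rate and the function name `fold_gcxgc` are hypothetical.

```python
import numpy as np

def fold_gcxgc(signal, acquisition_rate_hz, modulation_time_s=5.0):
    """Fold a 1D detector trace into a 2D array.

    Rows correspond to consecutive modulations (first-dimension time),
    columns to the time within a modulation (second-dimension time).
    """
    points_per_mod = int(round(acquisition_rate_hz * modulation_time_s))
    n_mod = len(signal) // points_per_mod          # drop an incomplete final modulation
    trimmed = np.asarray(signal, dtype=float)[: n_mod * points_per_mod]
    return trimmed.reshape(n_mod, points_per_mod)

# Synthetic trace: 2 Hz (assumed) and 5 s modulations give 10 points per row.
trace = np.arange(30)
plane = fold_gcxgc(trace, acquisition_rate_hz=2.0)
```

Each row of `plane` then holds one second-dimension chromatogram, which is why a compound eluting over several modulations appears as several second-dimension peaks.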
Figure 5. Example of a surface plot chromatogram from GCxGC-TOFMS analysis of one plasma sample from Paper I. First- and second-dimension retention times (RT1 and RT2) are shown on the x- and y-axes respectively and the peak intensities on the z-axis (visualized by a colour scale).
3.3.5. UHPLC-Q-TOFMS analysis
As mentioned above there are advantages of using LC-MS compared to GC-MS,
with the wider range of metabolites that can be detected being one of the most
central advantages. Due to these advantages LC-MS is a widely used technique
in metabolomics168–170. However, as discussed there are also drawbacks with LC-
MS, mainly associated with more problematic metabolite identification.
In the UHPLC-Q-TOFMS analysis in this work a reversed-phase C18 column was
used and gradient elution was performed with a mix of acetonitrile and
isopropanol as the organic solvent. ESI was used as ionization technique and the
samples were analysed in both positive and negative ionization mode. All
samples were analysed in MS mode and in addition one quality control (QC)
sample was run in auto MS/MS mode to aid in the identification process. For
more details regarding the UHPLC-Q-TOFMS analysis see Paper II.
3.3.6. Additional analytical aspects
An important aspect of metabolite analysis is the construction of the analytical
run order. For example, in an analytical run where all cases are analysed in the
first half of the run and all controls in the second half any true differences in
metabolite concentrations cannot be distinguished from differences due to
analytical drift. Therefore, attempts should be made prior to the analysis to
avoid these confounding effects. In this work, since all statistical analysis has
been performed at cohort level, the main strategy has been to randomize the samples
based on the sample groups of main interest. Furthermore, after the analysis all
putative metabolites showing a strong correlation to the analytical run order have
been excluded as a filtering step prior to the multivariate data analysis.
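The two safeguards described above, randomizing the run order and removing putative metabolites that correlate with it, can be sketched as follows. This is a minimal illustration and not the procedure used in the papers: it uses an unconstrained random permutation, Pearson correlation and an arbitrary threshold of 0.5, all of which are assumptions, as are the function names.

```python
import numpy as np

def randomized_run_order(n_samples, seed=0):
    """Return a random permutation of sample indices to use as injection order."""
    rng = np.random.default_rng(seed)
    return rng.permutation(n_samples)

def run_order_filter(X, run_order, max_abs_r=0.5):
    """Return a boolean mask of features to keep.

    X has one row per sample and one column per putative metabolite;
    run_order gives the injection position of each sample. A feature is
    kept when |Pearson r| between its values and the run order is at most
    max_abs_r (the threshold is an assumption, not a value from the thesis).
    """
    X = np.asarray(X, dtype=float)
    order = np.asarray(run_order, dtype=float)
    xc = X - X.mean(axis=0)
    oc = order - order.mean()
    denom = np.sqrt((xc ** 2).sum(axis=0) * (oc ** 2).sum())
    # Constant features have zero variance; treat their correlation as 0.
    r = np.divide(xc.T @ oc, denom, out=np.zeros(X.shape[1]), where=denom > 0)
    return np.abs(r) <= max_abs_r
```

A design that balances the sample groups across the run, rather than relying on a plain permutation, may be preferable when the groups are small.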
In the GC-MS and GCxGC-MS analyses an alkane series (C8 to C40) was
included in the run to be able to calculate retention indexes for use in the
identification processes. One of the advantages of GC-MS techniques with EI is
the ability to use comparable retention indexes instead of the more varying
retention times to compare metabolites analysed on different instruments.
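As an illustration of how an alkane series is used, the sketch below computes a linear retention index by interpolating between the two alkanes that bracket a peak, a common choice for temperature-programmed GC. Whether the thesis used exactly this formula is not stated, so it is an assumption here; the function name and the retention times in the example are hypothetical.

```python
import bisect

def retention_index(rt, alkane_rts):
    """Linear retention index from bracketing n-alkanes.

    alkane_rts maps carbon number to retention time and is assumed to
    increase with carbon number. RI = 100 * (n + (N - n) * (rt - t_n) / (t_N - t_n)),
    where n and N are the carbon numbers of the bracketing alkanes.
    """
    carbons = sorted(alkane_rts)
    times = [alkane_rts[c] for c in carbons]
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time outside the alkane series")
    i = min(bisect.bisect_right(times, rt) - 1, len(times) - 2)
    c_lo, c_hi = carbons[i], carbons[i + 1]
    t_lo, t_hi = times[i], times[i + 1]
    return 100.0 * (c_lo + (c_hi - c_lo) * (rt - t_lo) / (t_hi - t_lo))

# Hypothetical retention times (s) for C10 to C12.
alkanes = {10: 300.0, 11: 360.0, 12: 420.0}
ri = retention_index(330.0, alkanes)  # halfway between C10 and C11
```

Because the index is expressed relative to the alkanes rather than in absolute time, it is less sensitive to differences in column length or flow between instruments.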
During the sample extraction procedure a number of QC samples were created
by pooling aliquots from all or a selection of the samples. For the LC-MS
analysis the QC samples were prepared as a dilution series with five different
concentrations. The QC samples were then analysed at the beginning and end of
each run and also throughout the run (for example after every 10th sample). The
QC samples are important for evaluating the analytical procedure and they
should preferably display only minor variation compared to the biological
samples. In addition to the QC samples, blank samples (usually extraction mix
and water instead of sample) were included as method controls and carried through
all steps from the metabolite extraction onwards.
3.4. Processing of metabolomics data
The main purpose of the processing of metabolomics data is to convert the data
obtained from the metabolite analysis into a format suitable for data analysis
and evaluation (in this case multivariate modelling). In hyphenated
chromatography/mass spectrometry methods, the chromatogram and mass
spectra for each sample need to be converted into a table containing relative
metabolite concentrations with corresponding pure mass spectral profiles of the
metabolites for all samples. One of the difficulties in processing metabolomics
data is the presence of overlapping or co-eluting peaks in the chromatograms
which can lead to mass spectra with mixed signals. Therefore, a curve resolution
or deconvolution step is needed to resolve the chromatographic profiles for each
individual metabolite. In this work, several data processing methods have been
used, mainly depending on the analytical technique (see Table 3 for overview).
The processing methods will be described and discussed below with main focus
on the processing of GCxGC-TOFMS data since the majority of the data in this
work are acquired using that technique.
3.4.1. Processing of GCxGC-TOFMS data
For GCxGC-TOFMS data the addition of the second chromatographic dimension
makes the data more complex and thus the data processing more challenging. In
Paper I, a combination of two software packages was used for the data processing:
ChromaTOF (from LECO accompanying the GCxGC-TOFMS instrument) and
Guineu167 (VTT, Espoo, Finland, publicly available). Simply put, peaks need to
be detected and deconvoluted to obtain pure chromatographic and mass
spectral profiles. The idea behind deconvolution/curve resolution is to
mathematically separate overlapping chromatographic peaks using differences
in mass spectral profiles155. Even though the resolution is increased in two-
dimensional data compared to one-dimensional data there is still a need for
deconvolution of GCxGC-TOFMS data. For two-dimensional data, several peaks
from the same compound can elute in different second-dimension modulations
and therefore they need to be merged by spectral matching155. The steps
performed in the ChromaTOF software were baseline correction, peak
detection, mass spectra deconvolution, merging of second-dimension peaks,
automated mass spectral library searches for initial identification and
calculation of peak areas. In ChromaTOF all these steps were performed for one
sample at a time and resulted in one file per sample containing peak and mass
spectral information. To be able to use the data for multivariate data analysis
several additional steps were required such as alignment of peaks and matching
of metabolites between samples to allow between-sample comparisons. For this
the Guineu software was used, performing the following steps: alignment
including spectral similarity search, normalization with IS, removal of artefacts
and filtering of peaks detected in few samples. These steps resulted in one file
containing relative metabolite concentrations for all metabolites over all
samples along with mass spectral information. Even though an automated mass
spectral library search was performed in ChromaTOF, the average mass spectra
obtained from Guineu were used for manual investigation and confirmation of
putative metabolite identities (see chapter 3.4.4. Metabolite identification).
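Two of the post-alignment steps listed above, normalization with an internal standard and filtering of peaks detected in few samples, can be sketched generically as below. This is not Guineu's implementation: it assumes a single internal standard per sample, undetected peaks stored as NaN, and an arbitrary 50% detection threshold; the function name is hypothetical.

```python
import numpy as np

def normalize_and_filter(peak_areas, is_areas, min_detected_fraction=0.5):
    """Normalize peak areas to an internal standard and drop sparse peaks.

    peak_areas: samples x peaks array with NaN where a peak was not detected.
    is_areas:   internal standard peak area for each sample.
    Returns the normalized, filtered table and the boolean keep mask.
    """
    areas = np.asarray(peak_areas, dtype=float)
    is_areas = np.asarray(is_areas, dtype=float)
    normalized = areas / is_areas[:, None]           # relative concentration per sample
    detected_fraction = np.mean(~np.isnan(areas), axis=0)
    keep = detected_fraction >= min_detected_fraction
    return normalized[:, keep], keep
```

With several internal standards, each peak would typically be normalized to a suitable standard (for example the closest in retention time) instead of a single one.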
In Papers II and III, an in-house developed extension of the hierarchical
multivariate curve resolution (HMCR) methodology was used. The HMCR
methodology was originally developed for one-dimensional GC-MS data171 but
was here applied to two-dimensional data (2D HMCR). Compared to the regular
deconvolution/curve resolution procedure described above the HMCR approach
contains two important additional features, as reflected in the name, the
hierarchical and multivariate features171,172. The multivariate feature allows the
curve resolution to be performed simultaneously for all samples; this gives the
advantage that the same variables (putative metabolites) are found for all
samples. For one putative metabolite there will be a resolved chromatographic
profile for all samples, reflecting the relative metabolite concentration in each
sample, but only one common pure mass spectral profile, that can be used in the
identification. Therefore, no matching of metabolites between different samples
is needed. The hierarchical feature allows for curve resolution of complex GC-
MS data by dividing the chromatogram into time windows that are then
resolved separately. In 2D HMCR this was done taking both dimensions into
consideration (setting time windows using boxes). Steps included in the HMCR
processing are background reduction, smoothing of the mass channels,
alignment of samples (by finding the maximum covariance between the
chromatograms), division of the chromatogram into time windows and
multivariate curve resolution of each time window to obtain pure
chromatographic and mass spectral profiles171.
3.4.2. Processing of GC-TOFMS data
The GC-TOFMS data were processed using a targeted approach with a
stand-alone (Matlab) program developed in-house. In this processing an in-house
mass spectral library is used as the target. The fact that only the compounds
searched for can be detected can be considered both an advantage and a
disadvantage, since all compounds will have a putative identity but interesting
compounds, not included in the target library, will be missed. The steps
performed included: alignment (using multiple IS), calculation of RI, peak
detection, mass spectra deconvolution and mass spectra library targeting. After
the processing the detected peaks were investigated and a manual mass spectra
library search was performed for further confirmation of the putative
metabolites.
3.4.3. Processing of UHPLC-Q-TOFMS data
Processing of the UHPLC-Q-TOFMS data was performed using a combination
of MassHunter software packages from Agilent. Qualitative Analysis was initially used
for Molecular Feature Extraction using chromatographic deconvolution. For
automatic feature extraction in all samples the DA Reprocessor software was
used. This resulted in one file for each sample containing all detected
compounds including molecular weight, retention time, m/z and abundance.
These files were imported to Mass Profiler Professional (MPP) for compound
alignment and filtering (to remove compounds present in few samples). A
so-called recursion step was then performed by importing the filtered compound
list into Qualitative Analysis and using the Find by Formula feature. The reduced
compound list works as a target list to search for these compounds in the
samples. This two-step process of molecular feature extraction and recursion
improves the reliability of the detected peaks. In addition, Quantitative
Analysis was used for quantification of IS. The QC samples were used to exclude
compounds with insufficient correlation to the QC dilution series.
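The QC-based exclusion can be sketched as a correlation filter against the dilution series. The sketch assumes Pearson correlation between each compound's QC intensities and the relative QC concentrations, and keeps compounds above a threshold; the threshold of 0.8, the two-fold dilution steps in the example and the function name are assumptions, since the thesis only states that five concentrations were used.

```python
import numpy as np

def dilution_series_filter(qc_intensities, relative_concentrations, min_r=0.8):
    """Return a boolean mask of compounds that track the QC dilution series.

    qc_intensities: dilution levels x compounds array.
    relative_concentrations: relative concentration of each QC level.
    """
    Y = np.asarray(qc_intensities, dtype=float)
    c = np.asarray(relative_concentrations, dtype=float)
    yc = Y - Y.mean(axis=0)
    cc = c - c.mean()
    denom = np.sqrt((yc ** 2).sum(axis=0) * (cc ** 2).sum())
    # Compounds with constant intensity get r = 0 and are excluded.
    r = np.divide(cc @ yc, denom, out=np.zeros(Y.shape[1]), where=denom > 0)
    return r >= min_r

# Five QC levels (assumed two-fold dilution steps).
conc = np.array([1.0, 0.5, 0.25, 0.125, 0.0625])
qc = np.column_stack([1000.0 * conc,        # follows the dilution
                      np.full(5, 50.0),     # constant background signal
                      100.0 * (1.0 - conc)])  # decreases with concentration
mask = dilution_series_filter(qc, conc)
```

Signals that do not scale with the amount of sample, such as background ions or carry-over, are the kind of compounds this filter is meant to remove.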
3.4.4. Metabolite identification
Identification of metabolites is a crucial but often cumbersome process. To be
able to develop diagnostic tests and to make biological interpretations
metabolite identities are required. Another reason for identifying the
metabolites is to be able to highlight and remove unwanted peaks, such as
analytical artefacts, prior to the multivariate data analysis. As mentioned
previously, metabolites need to be derivatized prior to GC(xGC)-MS analysis.
Thus, there is an important distinction between identification in GC(xGC)-MS
and LC-MS data. In GC(xGC)-MS the derivatized metabolites are detected and
identified and therefore the identity must be converted to the original
metabolite. In LC-MS on the other hand no derivatization is needed and the
original metabolites are therefore detected and identified.
Metabolite identification in both GC-TOFMS and GCxGC-TOFMS usually
includes comparison of mass spectra to available mass spectral libraries; here
in-house and publicly available libraries (from the National Institute of Standards
and Technology (NIST) and the Max Planck Institute in Golm) have been used.
For the mass spectral library search the NIST MS Search 2.0 software can be used,
where a normalized dot product between the query spectrum and each library
spectrum is calculated to give a match value, where 999 corresponds to
identical spectra. A match value above 800 is generally considered a good
match. However, it can be difficult to set a general threshold for the match
value. It is important to note that all of the “identified” metabolites in this work
are putatively annotated compounds only, having used retention indexes and
mass spectra library matches in the identification173. Confirmatory identification
requires the analysis of compound standards117. In addition to the putatively
annotated metabolites there are also metabolites that have a putatively
annotated class and metabolites with unknown identity included in this work.
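The idea of a dot-product match value can be illustrated with a plain cosine similarity scaled to 0-999. This is a simplified stand-in and not the scoring used by NIST MS Search, which applies additional weighting; the function name and the example spectrum are hypothetical.

```python
import numpy as np

def match_value(query, library):
    """Cosine similarity between two spectra, scaled to 0-999.

    Spectra are dicts mapping integer m/z to intensity. Intensities are
    used unweighted, which is a simplification.
    """
    mzs = sorted(set(query) | set(library))
    q = np.array([query.get(mz, 0.0) for mz in mzs], dtype=float)
    l = np.array([library.get(mz, 0.0) for mz in mzs], dtype=float)
    denom = np.linalg.norm(q) * np.linalg.norm(l)
    if denom == 0:
        return 0
    return int(round(999 * float(q @ l) / denom))

# Illustrative spectrum only; intensities are made up.
spectrum = {73: 999.0, 147: 300.0, 204: 150.0}
identical = match_value(spectrum, spectrum)
```

Because the measure is normalized, a spectrum recorded at half the overall intensity still gives the maximum value, while spectra without shared ions give zero.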
Even though identified metabolites are required to make biological
interpretations it can still be of interest to include unidentified metabolites.
Potential metabolites without proper library matches can be of special interest
in studies of bacterial infections since some metabolites might be of bacterial
origin and can be missing in the libraries. In this work, unidentified metabolites
have been compared between papers/cohorts to be able to find common
unidentified metabolites of interest (see chapter 3.7. Comparison of metabolites
between studies).
Metabolite identification in UHPLC-Q-TOFMS is more challenging than for
GC-MS-based techniques. The soft ionization techniques often used in LC-MS do
not produce any substantial fragmentation; instead the molecular mass can be
obtained. If the mass accuracy is high enough a limited number of hits can be
found in database searches (common databases are METLIN and MassBank)174.
Another, or often complementary, strategy is to perform MS/MS analysis to
increase the fragmentation; however, these fragmentation patterns are not as
reproducible as the EI GC-MS spectra and will therefore differ between instruments175.
This adds another complication to the identification process.
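How mass accuracy limits the number of database hits can be shown with the usual parts-per-million (ppm) mass error. The sketch below is generic and does not reproduce a METLIN or MassBank query; the 5 ppm tolerance, the function names and the candidate masses are hypothetical.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Mass error of a measurement relative to a theoretical m/z, in ppm."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6

def candidates_within_tolerance(measured_mz, candidates, tol_ppm=5.0):
    """Return the names of candidates whose m/z lies within tol_ppm."""
    return sorted(
        name for name, mz in candidates.items()
        if abs(ppm_error(measured_mz, mz)) <= tol_ppm
    )

# Hypothetical candidate list (name -> theoretical m/z).
candidates = {"candidate_A": 200.0000, "candidate_B": 200.0008, "candidate_C": 200.0100}
hits = candidates_within_tolerance(200.0005, candidates)
```

In this made-up example two candidates remain within the tolerance, which illustrates why an accurate mass alone often narrows the list without giving a unique identity.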
3.5. Multivariate data analysis
In metabolomics, the number of measured metabolites is often much larger
than the number of samples. This creates a data table which is complex and
difficult to analyse based on classical statistical assumptions. Therefore, dedicated data
analysis methods are needed to extract relevant information from this
type of data table. Multivariate projection-based methods are commonly used
in metabolomics due to their ability to handle a large number of often correlated
variables that may contain missing values176. The main principle of these
projection methods is to reduce the multi-dimensional data (one dimension for
each variable) to low-dimensional models to facilitate visualization and
interpretation177. These models can aid in the extraction of latent
variables/biomarkers, a combination of several variables (metabolites) that
provides a stronger predictive pattern as compared to single variables. The
multivariate projection methods used in this work will be described in detail
below.
3.5.1. PCA
Principal component analysis (PCA)178 is a multivariate projection method used
to explain the systematic variation among a large number of variables (x
variables) in a lower number of new, independent variables, called principal
components or latent variables. These principal components contain
information about all the measured variables and also take into account the
correlation between the variables. It is therefore possible to find patterns that
would not be detected using classical statistical methods. PCA is an
unsupervised method, which means that no information about class belonging
or similar is given to the model a priori. In metabolomics analysis, a matrix X is
formed based on N rows, where each row represents one sample (for example a
human plasma sample) and K columns, where each column represents a
variable i.e. a metabolite (with the relative metabolite concentrations for each
sample). Principal components are calculated based on this matrix and in this
work the nonlinear iterative partial least squares (NIPALS)179 method has been
applied. The first principal component will describe the largest variation in the
data and the second component, orthogonal to the first, will describe the second
largest variation and so on. For each principal component a score vector, t, is
calculated containing score values for all samples summarizing all original
variables. A corresponding loading vector, p, is also calculated for each principal
component containing weight values for each variable highlighting their
individual importance for the principal component.
The PCA model of X can be summarized according to the following equation:
$$X = T P^{\top} + E$$
where T is the score matrix (combination of the score vectors t), P is the loading
matrix (combination of the loading vectors p) and E is the residual matrix
containing the variation not explained by the model. Preferably, the residual
matrix should contain only unstructured noise.
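As an illustration only, the NIPALS calculation of scores and loadings can be sketched in Python with NumPy. The function name nipals_pca, the simulated data and the convergence settings are assumptions made for this example; the sketch does not reproduce the software used in this work.

```python
import numpy as np

def nipals_pca(X, n_components, tol=1e-10, max_iter=500):
    """Extract principal components one at a time (NIPALS sketch)."""
    E = np.array(X, dtype=float)             # residual matrix, deflated per component
    n_samples, n_variables = E.shape
    T = np.zeros((n_samples, n_components))  # score vectors t as columns
    P = np.zeros((n_variables, n_components))  # loading vectors p as columns
    for a in range(n_components):
        t = E[:, np.argmax(E.var(axis=0))].copy()  # start from the most variable column
        for _ in range(max_iter):
            p = E.T @ t / (t @ t)            # project the variables onto t
            p /= np.linalg.norm(p)           # loadings are normalised to unit length
            t_new = E @ p                    # project the samples onto p
            converged = np.linalg.norm(t_new - t) <= tol * np.linalg.norm(t_new)
            t = t_new
            if converged:
                break
        T[:, a], P[:, a] = t, p
        E = E - np.outer(t, p)               # remove the variation explained by this component
    return T, P, E

# Simulated data (assumption): 30 samples, 8 variables, two underlying trends
rng = np.random.default_rng(0)
latent = rng.normal(size=(30, 2)) * np.array([4.0, 1.0])
X = latent @ rng.normal(size=(2, 8)) + 0.3 * rng.normal(size=(30, 8))
X = X - X.mean(axis=0)                       # mean-centre each variable
T, P, E = nipals_pca(X, n_components=2)
```

By construction, the returned matrices satisfy the PCA model, i.e. X equals T times the transpose of P plus the residual matrix E.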
Interpretation of a PCA model is usually done by plotting the scores and
loadings side by side. The score plot shows the distribution of the samples based
on the original variables where similarities and differences between the samples
can be found. In the loading plot the variables (metabolites) giving rise to the
pattern seen in the score plot can be found. Examples of score and loading plots
can be seen in Figure 6.
Figure 6. Example of a PCA model with data from Paper I comparing S. Typhi infection, S. Paratyphi A infection and afebrile controls. A) Score plot of the first two principal components (t[1] vs t[2]) where each point represents a patient sample and samples close together represent samples with similar metabolite patterns. A separation can be seen between the acute enteric fever samples and the afebrile controls, although not clearly along the first or second component separately. B) Corresponding loading plot of the first two principal components (p[1] and p[2]), contributing to the distribution of the samples seen in A, where each point represents a metabolite and the higher the loading value of a metabolite, the more it contributes to the distribution in A. Top 5 enteric fever and afebrile control metabolites highlighted in filled and open blue circles respectively (i.e. metabolites having higher relative concentration in one group compared to the other).
3.5.2. OPLS
Multivariate regression methods are suitable when the measured variables
(metabolites) in matrix X can be related to other information in one or more y
variables, so-called response variables. A response variable can be qualitative,
such as information about sample groupings related to disease or patient sex,
but also quantitative, such as patient age or analytical run order. One well-known
multivariate regression method is partial least squares (PLS)180. However, in
this work the extension of PLS, orthogonal PLS (OPLS)181 was the method of
choice. OPLS and PLS are supervised methods since a priori information about
the samples (the response data) is used in the modelling. The OPLS components
are calculated in a somewhat similar way to PCA but instead of maximizing the
variation described in X the components in OPLS should maximize the
covariation between X and Y at the same time as they should describe the
variation in X and Y individually in the best possible way. In OPLS, compared
to PLS, the X information is divided into two parts, one predictive part with
systematic variation in X that is related to Y and one orthogonal part with
systematic variation in X that is not related (orthogonal) to Y. This division
makes the visualization and interpretation of the models easier, since the
information of interest (the information related to the response) can be
distinguished from other types of systematic variation and thus investigated
separately.
The OPLS model can be summarized according to the following equations for X
and Y respectively:
$$X = T_p P_p^{\top} + T_o P_o^{\top} + E$$
$$Y = T_p C_p^{\top} + F$$
where T is the score matrix, P is the loading matrix for X, C is the weight matrix
for Y, E and F are the residual matrices for X and Y respectively, and the
predictive components are denoted with p and the orthogonal components with o.
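The division of X into a predictive and an orthogonal part can be illustrated with a minimal single-response sketch in Python with NumPy, using one predictive and one orthogonal component. The function name opls_one_orthogonal and the simulated data are assumptions made for this example, and the sketch is not the implementation used in this work.

```python
import numpy as np

def opls_one_orthogonal(X, y):
    """OPLS sketch for centred X and a single centred response y."""
    w = X.T @ y
    w /= np.linalg.norm(w)             # predictive weight vector
    t = X @ w                          # provisional predictive scores
    p = X.T @ t / (t @ t)              # corresponding X loadings
    w_o = p - (w @ p) * w              # part of the loadings not aligned with w
    w_o /= np.linalg.norm(w_o)
    t_o = X @ w_o                      # orthogonal scores (uncorrelated with y)
    p_o = X.T @ t_o / (t_o @ t_o)      # orthogonal loadings
    X_f = X - np.outer(t_o, p_o)       # X with the orthogonal variation removed
    # w is unchanged by the filtering because t_o is orthogonal to y
    t_p = X_f @ w                      # predictive scores
    p_p = X_f.T @ t_p / (t_p @ t_p)    # predictive loadings
    c_p = (t_p @ y) / (t_p @ t_p)      # Y weight
    E = X_f - np.outer(t_p, p_p)       # X residuals
    F = y - t_p * c_p                  # Y residuals
    return t_p, p_p, t_o, p_o, c_p, E, F

# Simulated data (assumption): 40 samples in two classes, 10 variables
rng = np.random.default_rng(1)
y = np.repeat([0.0, 1.0], 20)
y = y - y.mean()
X = (np.outer(y, rng.normal(size=10))                       # variation related to y
     + np.outer(rng.normal(size=40), rng.normal(size=10))   # unrelated systematic variation
     + 0.2 * rng.normal(size=(40, 10)))                     # noise
X = X - X.mean(axis=0)
t_p, p_p, t_o, p_o, c_p, E, F = opls_one_orthogonal(X, y)
```

With a binary y as in this example, the setting corresponds to a two-class discriminant analysis; the orthogonal scores t_o carry systematic variation in X that has no correlation with the response.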
3.5.2.1. OPLS-DA
When the response data is qualitative, e.g. case/control information or similar,
classification of the samples can be performed using multivariate discriminant
analysis (DA). In the case of OPLS discriminant analysis (OPLS-DA)182 a so-
called dummy matrix with a binary description of sample class is used as the
response Y. In the OPLS-DA model the measured variables in the X matrix are
correlated to the response variables in the Y dummy matrix to obtain the
maximum separation between the a priori defined sample classes. In a similar
way as for PCA the OPLS-DA models can be interpreted in terms of related
score and loading plots, in this case using the covariance loadings w*. Two ways of
visualizing scores and covariance loadings are shown in Figure 7. Due to
primary interest in the first predictive components of the OPLS-DA models the
representation in Figure 7C and D has been used in the description of Papers
I-III in chapter 4. Results.
Figure 7. Example of a 2-class OPLS-DA model with data from Paper I comparing S. Typhi infection and afebrile controls with two differing ways of visualizing scores and covariance loadings. A) Cross-validated scores for the first predictive and first orthogonal components (tcv[1] vs tocv[1]) showing the relationships between the samples similar to the PCA scores plot in Figure 6A. B) Corresponding covariance loadings of the first predictive and first orthogonal components (w*[1] and wo[1]) showing the distribution of metabolites similar to the PCA loadings in Figure 6B, here highlighting the top 5 S. Typhi and afebrile control metabolites in filled and open blue circles respectively (i.e. metabolites having higher relative concentration in one group compared to the other). C) Cross-validated scores for the first predictive component (tcv[1]) as an alternative way of visualizing scores when mainly one component is of interest. D) Covariance loadings for the first predictive component (w*[1]) corresponding to the scores in C, also highlighting the top 5 S. Typhi and afebrile control metabolites in filled and open blue circles respectively.
3.5.3. Use of multivariate techniques in the papers
In all papers, PCA has been used for peak investigation during the identification
process, overview of the data, detecting general trends and highlighting possible
outliers. In Paper I, PCA was also used to investigate the distribution of
analytical replicates. OPLS has been used to correlate the metabolites to the
analytical run order, to the age of the included patients but also to quantitative
clinical parameters available. The greater part of the multivariate modelling was
however done using OPLS-DA. In Papers I and III, 3-class OPLS-DA models
were used to get an additional overview of the data when having three classes (S.
Typhi, S. Paratyphi A and controls). However, the more detailed modelling has
been done using 2-class OPLS-DA models in all papers. These models have also
been used to highlight significant metabolites as potential biomarker patterns.
All data has been mean centred and scaled to unit variance prior to modelling.
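Mean centring and scaling to unit variance can be sketched as follows in Python with NumPy. The function name autoscale and the small data table are hypothetical and serve only to show the effect of the scaling.

```python
import numpy as np

def autoscale(X):
    """Mean-centre every column and scale it to unit variance."""
    mean = X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)   # sample standard deviation per variable
    return (X - mean) / sd, mean, sd

# Hypothetical table: 3 samples (rows) and 2 metabolites (columns) on very different scales
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 900.0]])
X_scaled, mean, sd = autoscale(X)
```

After scaling, every metabolite has mean zero and unit variance, so that metabolites with high and low abundance are given the same initial weight in the model.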
3.6. Validation and interpretation of metabolomics data
Validation is a crucial part of metabolomics research and includes validation of
multivariate models and resulting biomarkers or biomarker patterns. In this
section, a few aspects of validation and interpretation in metabolomics analysis
will be described.
3.6.1. External and internal validation
For validation of multivariate models the different approaches can be divided
into two types, external and internal validation. In external validation additional
samples that were not involved in model calculations are used in the validation
step while the original model data itself is used in internal validation177.
Validation using an external dataset is preferred but not always possible due to
sample limitations. In this work, internal validation has mainly been used.
However, in Paper II a validation dataset was included and resulting
metabolites (potential biomarkers) have been compared across datasets, which
provides a type of external validation.
3.6.2. Cross-validation
Cross-validation (CV)183,184 is a type of internal validation used in multivariate
models to assess the predictive ability of the model and to determine the
suitable number of model components. In CV (here in the case of
OPLS/OPLS-DA models) the data is divided into several groups (here seven). One
group is excluded and a model of the remaining groups is used to predict the
response values of the samples in the excluded group. These predicted values are then
compared to the samples’ original values where a model with good predictive
ability gives predicted values close to the true response values. This procedure is
repeated until all groups (i.e. all samples) have been excluded and their
response values predicted. The prediction error sum of squares (PRESS) is a
summarized measure of the squared differences in predicted and true response
values for all samples. The PRESS value is then used to calculate the Q2 value
which represents how much of the total variation in the response, Y, the model
can predict. The Q2 value can be used to determine the suitable number of
model components. A component can be considered significant if the Q2 value
increases by its addition. Q2 values are reported for all OPLS-DA models in this
work.
$$Q^2 = 1 - \frac{\mathrm{PRESS}}{\mathrm{SS}(Y)}$$
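The cross-validation procedure and the resulting Q2 value can be sketched in Python with NumPy. For brevity, a one-component PLS regression is used here as a stand-in for the OPLS models; this substitution, the assignment of samples to the seven groups, the use of the mean-centred sum of squares for SS(Y) and the simulated data are all assumptions made for this example.

```python
import numpy as np

def pls1_fit(X, y):
    """One-component PLS regression on centred data; returns regression coefficients."""
    w = X.T @ y
    w /= np.linalg.norm(w)
    t = X @ w
    c = (t @ y) / (t @ t)
    return w * c

def q2_cross_validated(X, y, n_groups=7):
    """Q2 = 1 - PRESS / SS(Y), leaving out one group of samples at a time."""
    groups = np.arange(len(y)) % n_groups              # every sample belongs to one CV group
    press = 0.0
    for g in range(n_groups):
        test = groups == g
        x_mean, y_mean = X[~test].mean(axis=0), y[~test].mean()
        b = pls1_fit(X[~test] - x_mean, y[~test] - y_mean)
        y_pred = (X[test] - x_mean) @ b + y_mean       # predict the excluded group
        press += np.sum((y[test] - y_pred) ** 2)       # accumulate squared prediction errors
    ss_y = np.sum((y - y.mean()) ** 2)                 # total variation in the response
    return 1.0 - press / ss_y

# Simulated data (assumption): the response depends on the first two variables
rng = np.random.default_rng(2)
X = rng.normal(size=(140, 10))
y = X[:, 0] + X[:, 1] + 0.1 * rng.normal(size=140)
q2_signal = q2_cross_validated(X, y)
q2_permuted = q2_cross_validated(X, rng.permutation(y))  # relation between X and y destroyed
```

A response that is truly related to X gives a Q2 value close to 1, whereas a permuted response gives a value close to or below zero.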
The model score values are considered more reliable if the cross-validated
version is used. Thus the majority of the models in this work are visualized as
cross-validated scores. Another use of the cross-validated data is in the
significance testing of the multivariate models, using a CV version of analysis of
variance (CV-ANOVA)185. In CV-ANOVA the residuals (differences between true
and predicted response values) from the cross-validation are used in an F-test to
determine if these residuals are smaller compared to the residuals resulting
from variation around the global response average. A model is considered
significant if the resulting p-value from this test is less than 0.05. CV-ANOVA p-
values are reported for all OPLS-DA models in this work.
3.6.3. Investigation of raw data
Investigation of the raw data is an important step in the validation and
interpretation of the results from metabolomics analyses, especially when
highlighting specific metabolites of interest. An example of this, from Paper I,
is shown in Figure 8. Here the raw data of one metabolite significant in the
three pairwise comparisons of the sample groups in Paper I (patients with S.
Typhi infection, S. Paratyphi A infection and afebrile controls) was investigated
in three samples, one from each sample group (highlighted in Figure 8A).
Since the data was analysed with GCxGC-TOFMS both one- (Figure 8B) and
two-dimensional chromatographic peaks (Figure 8C-E) were investigated and
in both chromatograms a difference in peak area between the three groups
could be seen.
Figure 8. Example of raw data investigation from Paper I. The raw data for one metabolite (phenylalanine) that was significant for separating samples from enteric fever patients from afebrile controls, and also for separating S. Typhi infection samples from S. Paratyphi A infection samples was investigated. A) Score plot of a 3-class OPLS-DA model highlighting three samples selected for raw data investigation (enlarged symbols with blue outline). B) One-dimensional chromatographic peak of phenylalanine (mass: 218) with first and second retention times (RT1 & 2) along the x-axis and peak intensities along the y-axis showing differences in peak intensities between the sample groups. C-E) Two-dimensional chromatographic peaks of phenylalanine (mass 218) with first- (RT1) and second-dimension retention times (RT2) along the x- and y-axes respectively and peak areas (with colour intensities) along the z-axis for the S. Typhi sample (C), the S. Paratyphi A sample (D) and for the afebrile control (E) showing differences in peak areas between the sample groups.
3.6.4. Receiver operating characteristic curves
Receiver operating characteristic (ROC) curves are a useful tool for validating
diagnostic tests both clinically and in biomarker discovery research124,186. ROC
curves visualize the sensitivity and specificity of the investigated diagnostic test
(in metabolomics studies one or several metabolites) for all possible cut-off
values for the classification of interest (discrimination between cases and
controls or similar)124. The sensitivity can be explained as the proportion of
true cases that receive a positive test result and the specificity as the
proportion of true non-cases (controls) that receive a negative test result.
Depending on the test application there can be different requirements for
sensitivity and specificity. A common way of interpreting ROC curves is to
calculate the area under the curve (AUC), where a value of 1.0 represents a
perfect test and a value of 0.5 represents a test that performs randomly. The
AUC value of a ROC curve is closely related to the Mann-Whitney U statistic187.
In this work ROC curves
have been created for OPLS-DA model score values and for individual
metabolites. An example of a ROC curve from Paper I is shown in Figure 9.
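The construction of a ROC curve and the relation between the AUC and the Mann-Whitney U statistic can be illustrated with a small sketch in plain Python. The score values and function names are hypothetical and chosen only for this example.

```python
def roc_points(case_scores, control_scores):
    """Return (false positive rate, true positive rate) pairs for every cut-off."""
    cutoffs = sorted(set(case_scores) | set(control_scores), reverse=True)
    points = [(0.0, 0.0)]
    for c in cutoffs:
        tpr = sum(s >= c for s in case_scores) / len(case_scores)        # sensitivity
        fpr = sum(s >= c for s in control_scores) / len(control_scores)  # 1 - specificity
        points.append((fpr, tpr))
    return points

def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def auc_mann_whitney(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties count as 0.5)."""
    wins = sum((a > b) + 0.5 * (a == b)
               for a in case_scores for b in control_scores)
    return wins / (len(case_scores) * len(control_scores))

cases = [0.9, 0.8, 0.4]      # hypothetical model scores for patients
controls = [0.7, 0.3, 0.2]   # hypothetical model scores for controls
points = roc_points(cases, controls)
auc_curve = auc_trapezoid(points)
auc_rank = auc_mann_whitney(cases, controls)
```

In this example eight of the nine possible case/control pairs are ranked correctly, so both calculations give an AUC of 8/9.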
3.6.5. Significant metabolites
Variable selection is a widely discussed subject within metabolomics analysis
and many different approaches exist, including both multivariate and univariate
methods188–191. Different aspects of univariate and multivariate methods are
reviewed by Saccenti et al. and one main conclusion is that multivariate and
univariate methods can be used in a complementary manner to determine
metabolite significance192. One important advantage of multivariate methods is that the
interaction or co-variation of metabolites is taken into consideration. In this
work, the main strategy has been to use multivariate models to highlight
significant metabolites. Univariate p-values have also been calculated but
generally not used in the variable selection. For significance, model covariance
loadings, w*, from 2-class OPLS-DA models have been used. The model
covariance loadings relate the measured variables to the true response (Y). In
Papers I and II, the significance limit of w* was either a fixed value or
calculated from the average and standard deviation for each model. In Paper
III, the recently developed latent significance approach was used (Jonsson et
al., in preparation). In the latent significance approach the unique benefit of
OPLS, the ability to model and subtract orthogonal variation, is utilized also in
the variable selection making this truly multivariate. Latent model covariance
loadings, wlatent, can be calculated by removing the orthogonal variation, found
through OPLS, to focus on the variation relevant to the response. An additional
advantage in the latent significance approach is that statistical limits for wlatent
can be used for determining significant metabolites, something that cannot be
done with the original model covariance loadings. For the univariate tests, the
Student's t-test and/or the Mann-Whitney U-test have been used depending on
the sample size and distribution (both with a significance limit of 0.05). When
significant metabolites have been highlighted these metabolites have been
compared to other metabolites from other models within the same cohort or to
models in other cohorts (see chapter 3.7. Comparison of metabolites between
studies). Further, models with a reduced number of metabolites have mainly
been based on metabolites common between models/cohorts.
Figure 9. Example of ROC curves with data from Paper I. ROC curves for cross-validated scores from OPLS-DA models of S. Paratyphi A compared to afebrile controls based on 46 and 6 metabolites along with the best individual metabolite. False positive rates (i.e. 1-specificity) and true positive rates (i.e. sensitivity) are shown along the x- and y-axes respectively. In general, the closer the curve is to the upper left corner the better. AUC values for the respective curves are, 46 met: 0.999, 6 met: 0.948, top met: 0.884.
3.6.6. Biological interpretation
It is a complex task to provide a relevant biological interpretation of
metabolomics data. Most metabolites are involved in a number of biochemical
processes and it is often difficult to obtain a complete overview. There are many
databases or software available for different types of pathway analysis that can
be used as tools for the biological interpretation. In this work, the focus has not
been on making thorough biological interpretations of the metabolites and their
interactions. Instead the focus has been on showing the potential of the used
methodology and to compare metabolites between different cohorts. Therefore,
pathway analysis has only been performed to a low degree and the metabolites
highlighted as important in the studies have only been briefly investigated.
However, the work and results presented here provide great opportunities for
further, more detailed, biological interpretation of metabolite patterns related to
enteric fever. Such information would be useful in the development of
diagnostic tests but also to provide more insight into the mechanisms of disease.
3.7. Comparison of metabolites between studies
It can be of great value to investigate the robustness of detected metabolites.
One way of doing this is to compare metabolites between similar
studies/cohorts/models. In this work, the interesting metabolites have been
compared both within and between cohorts. For comparing metabolites
between cohorts analysed with GC-TOFMS or GCxGC-TOFMS, the initial step
has been to include the metabolite mass spectra for the previous cohorts in the
library search during the identification process to be able to find matches
between both identified and unidentified metabolites between the cohorts.
Metabolites found significant have then been compared to the metabolites from
cohorts of interest looking at degree of significance and direction of change. All
comparisons made between the studies/cohorts in this thesis are listed in Table
4. Metabolites were also compared within cohorts, for example the cohorts in
Papers I and III including three pairwise comparisons (S. Typhi vs. Control, S.
Paratyphi A vs. Control and S. Typhi vs. S. Paratyphi A). When a set of
common metabolites expressing the same alteration in several cohorts has
been detected, corresponding OPLS-DA models can be calculated based on these
common metabolites in order to investigate the robustness of the detected
metabolite pattern. Examples of such models can be found in Paper II.
Table 4. Comparison of significant metabolites between cohorts.
Paper/cohort | Paper/cohort compared to | Model compared to
Paper II: Acute typhoid. Bangladeshi cohort | Paper I: Acute enteric. Nepali cohort | S. Typhi vs. Control
Paper II: Acute typhoid. Bangladeshi + Senegalese validation cohort | Paper I: Acute enteric. Nepali cohort | S. Typhi vs. Control
Paper II: Acute typhoid. Bangladeshi + Senegalese validation cohort | Paper II: Acute typhoid. Bangladeshi cohort | S. Typhi vs. Control
Paper III: Chronic carriers. Nepali cohort | Paper I: Acute enteric. Nepali cohort | S. Typhi+S. Paratyphi A vs. Control
4. Results
4.1. Paper I
Salmonella Typhi and Salmonella Paratyphi A elaborate
distinct systemic metabolite signatures during enteric
fever
There is a need for improved diagnostic tests for enteric fever3. The hypothesis
is that the host-pathogen interactions can be responsible for generating specific
metabolite patterns in the blood of enteric fever patients that potentially could
be used to distinguish the enteric fever patients from patients with other
infections. Salmonella Typhi and Salmonella Paratyphi A are closely related
bacteria that cause a very similar disease in humans. However, due to
differences in epidemiology, antimicrobial susceptibility and availability of
vaccines it would be useful to be able to distinguish between them9,10,83. Due
to their genetic and biochemical differences these two bacteria may potentially
give rise to unique metabolite patterns in the blood making it possible to also
differentiate between the causative agents of enteric fever.
4.1.1. Objective
The objective of this first metabolomics study targeting enteric fever was to
investigate the metabolite patterns in plasma samples from patients in Nepal
with blood culture-confirmed acute S. Typhi infection, acute S. Paratyphi A
infection along with afebrile controls. The study setup and a brief overview of
the methods used are shown in Figure 10.
Figure 10. Overview of study design and methods used in Paper I.
4.1.2. Results
OPLS-DA modelling comparing metabolite patterns for acute enteric fever and
afebrile controls revealed a significant separation for both S. Typhi and S.
Paratyphi A from control (Figure 11A and B). A separation, although not as
clear, could also be found comparing the two causative agents of enteric fever, S.
Typhi and S. Paratyphi A (Figure 11C).
Figure 11. Cross-validated scores for the first predictive component (tcv[1]p) in 2-class OPLS-DA models of acute enteric fever and afebrile controls based on 695 metabolites (error bars: mean score value with 95% confidence interval). A) S. Typhi samples significantly separated from afebrile controls (Q2=0.824, p=4.1×10^-20). B) S. Paratyphi A significantly separated from afebrile controls (Q2=0.815, p=4.2×10^-18). C) S. Typhi samples separated from S. Paratyphi A samples (with some overlapping samples) (Q2=0.140, p=0.062).
The metabolites contributing to the detected separations were investigated for
the three pairwise comparisons and metabolites significant in i) S. Typhi versus
S. Paratyphi A and ii) S. Typhi and/or S. Paratyphi A versus afebrile controls
were highlighted. This comparison resulted in 46 metabolites that could
separate acute enteric fever from afebrile controls but also S. Typhi from S.
Paratyphi A. OPLS-DA models of the three pairwise comparisons based on these
46 metabolites all resulted in significant separations, even for S. Typhi versus S.
Paratyphi A (Q2=0.420, p=2.6×10^-6).
To further investigate the diagnostic potential of the obtained metabolite
pattern, ROC curves were constructed using the scores (estimated and cross-
validated) from the OPLS-DA models based on 46 metabolites, along with the
individual metabolites, for each comparison. The ROC curves (Figure 12)
showed that the detected metabolite pattern gave AUC values ≥0.9 in all
comparisons, indicating a high diagnostic potential. The metabolite pattern gave
higher AUC values compared to the individual metabolites (the metabolite with
the highest AUC value for each comparison is shown in Figure 12). Finally, a
combination of 6 putatively identified or classified metabolites (ethanolamine,
gluconic acid, monosaccharide, phenylalanine, pipecolic acid and saccharide)
from the 46 was shown to retain a high diagnostic potential, with significant
OPLS-DA models and AUC values ≥0.8 in all comparisons (Figure 12).
Figure 12. ROC curves for cross-validated scores from 2-class OPLS-DA models based on 46 and 6 metabolites along with the best individual metabolite. A) S. Typhi vs. afebrile control, AUC values for CV-scores 46 met: 0.996, CV-scores 6 met: 0.923 and top metabolite (phenylalanine): 0.925. B) S. Paratyphi A vs. afebrile control, AUC values for CV-scores 46 met: 0.999, CV-scores 6 met: 0.948 and top metabolite (phenylalanine): 0.884. C) S. Typhi vs. S. Paratyphi A, AUC values for CV-scores 46 met: 0.898, CV-scores 6 met: 0.796 and top metabolite (2-phenyl-2-hydroxypropanoic acid): 0.693.
4.1.3. Discussion and Comments
The objective of this first metabolomics study, targeting enteric fever, was to
investigate the metabolite patterns related to acute enteric fever to see if
differences could be detected in plasma samples of patients with acute disease
compared to afebrile controls. After analysis with GCxGC-TOFMS and OPLS-
DA significant differences could be detected between enteric fever and the
afebrile controls (for S. Typhi and S. Paratyphi A respectively). Interestingly a
separation, although weaker, could be detected between the S. Typhi and S.
Paratyphi A infections. Decreasing the number of metabolites gave significant
separations between S. Typhi and S. Paratyphi A (in models based on 46 and 6
common metabolites). Although S. Typhi and S. Paratyphi A cause a similar
disease from a clinical perspective there are differences between the serovars
that suggest that it would be valuable to effectively discriminate between them.
There have been reports on differences in antimicrobial susceptibility between S.
Typhi and S. Paratyphi A with higher MICs, and higher likelihood for resistance
reported for S. Paratyphi A9,83. Another important difference between S. Typhi
and S. Paratyphi A is that there is no licenced vaccine against S. Paratyphi A92.
Apart from clinical isolation of the organisms there have been few other ways of
separating S. Typhi from S. Paratyphi A and therefore the results presented in
this study are promising. To further investigate and validate the differences in
metabolite patterns between S. Typhi and S. Paratyphi A additional, larger
studies would be required.
A large number of metabolites were found significant for separating the enteric
fever samples from afebrile controls (for both Salmonella serovars) and a
smaller number of metabolites were significant for separating S. Typhi from S.
Paratyphi A. In order to highlight more Salmonella-specific markers, the
metabolites with the ability to distinguish between patients with infections
caused by the two Salmonella serovars were selected initially and among them
the metabolites able to distinguish enteric fever patients from afebrile controls
were included in the final set. These 46 metabolites included metabolites with a
putative identity, a putative metabolite class and also unidentified metabolites.
ROC curves were constructed for the metabolite panels (from OPLS-DA model
scores) in the three pairwise comparisons as well as for the single metabolites.
This showed that the metabolite panels in general had higher diagnostic
potential compared to the single best metabolite, which visualized the strength
of using a combination of correlated metabolites, i.e. a latent biomarker, instead
of individual metabolites. As a final step, a metabolite panel consisting of a
smaller number of metabolites that still had a high diagnostic potential was
searched for among the 46 highlighted metabolites. A panel of 6 metabolites
was found (using only putatively identified or classified metabolites) which
resulted in significant OPLS-DA models as well as ROC curves with AUC values
above 0.8 for the enteric fever samples compared to controls and an AUC value
of 0.8 for the comparison of S. Typhi and S. Paratyphi A.
In future studies it would be of great interest to include other febrile diseases
with clinical presentation similar to enteric fever to be able to find biomarkers
more specific to typhoidal Salmonella. A strength in this study is the fact that
the two closely related bacteria S. Typhi and S. Paratyphi A could be
distinguished. This further highlights the potential of the used methodology. In
summary, this study has shown the possibilities of using metabolite patterns to
distinguish between plasma samples from patients with acute enteric fever and
afebrile controls and also for separating S. Typhi from S. Paratyphi A. In
addition, the strength of using a combination of metabolites as a biomarker
pattern compared to single metabolites was highlighted. This proof-of-concept
study for metabolomics targeting enteric fever is a first step towards using
metabolomics in the search for diagnostic biomarkers for enteric fever.
4.2. Paper II
Reproducible diagnostic metabolites in plasma from
typhoid fever patients in Asia and Africa
The results from Paper I showed that it was possible to find differences in
metabolic patterns in relation to acute enteric fever. To further explore the
metabolic patterns, a second study was conducted, this time focusing only on
Salmonella Typhi. The current gold standard method of diagnosing enteric fever
is blood culture. However, the sensitivity varies and is often too low (from 40
to 80%52,102). Thus, it would be of interest to investigate how samples from
patients with a negative blood culture but with clinically suspected typhoid
behave in relation to samples from culture-positive typhoid patients.
4.2.1. Objective
The objectives of the second study were to i) investigate metabolite patterns in
plasma from Bangladeshi patients with culture confirmed acute typhoid,
patients culture-negative but with clinically suspected typhoid and febrile
controls, ii) validate findings from Paper I and iii) investigate the diagnostic
potential in corresponding urine samples. In addition, a separate validation
cohort, including samples from Bangladesh and Senegal with a malaria
subgroup as part of the controls, was investigated. The study setup and a brief
overview of the methods used are given in Figure 13.
Figure 13. Overview of study design and methods used in Paper II.
4.2.2. Results
Plasma samples from patients culture-positive for typhoid were significantly
separated from the febrile controls based on their metabolite patterns in an
OPLS-DA model (Figure 14A). Using the model to predict the culture-
negative/clinically suspected typhoid samples resulted in a diverse picture with
five out of nine culture-negative typhoid samples predicted as culture-positive
typhoid and three out of nine as controls (one indifferent) (Figure 14B),
possibly indicating which of the culture-negative samples were true typhoid
samples. Interestingly, three of the five samples predicted as culture-positive
typhoid also had a positive PCR amplification for S. Typhi. Urine
samples were also analysed giving similar results as the plasma samples
regarding the separation of culture-positive S. Typhi from febrile controls,
although with some differences in the prediction of the culture-negative typhoid
samples (see supplementary material for Paper II).
Figure 14. 2-class OPLS-DA model of culture-positive typhoid and febrile controls with prediction of culture-negative/clinically suspected typhoid from the Bangladeshi cohort based on 236 plasma metabolites. A) Cross-validated scores for the first predictive component (tcv[1]p) showing a significant separation of culture-positive S. Typhi from febrile controls (Q2=0.598, p=5.9×10^-3). B) Predicted scores for the first predictive component (tPS[1]) showing 5/9 clinically suspected S. Typhi samples clearly resembling culture-positive S. Typhi and 3/9 more resembling febrile controls. Clinically suspected S. Typhi samples with a positive PCR amplification are marked with * inside the symbol.
The metabolites separating the culture-positive typhoid samples from the febrile
controls in the Bangladeshi cohort were compared to the metabolites separating
the S. Typhi samples from the afebrile controls in the Nepali cohort from Paper
I. Significant metabolites with the same direction of change in the two studies
were highlighted. Two criteria for significance were used in Paper II, w*>0.03
and w*> ±SD. For the first criterion, 33 metabolites were highlighted as
consistently regulated between the studies. For the second, stricter criterion, 15
metabolites were retained. OPLS-DA models based on these 15 metabolites
resulted in significant separations between the culture-positive typhoid samples
and the febrile controls from the Bangladeshi cohort (Figure 15A) and the
afebrile controls from the Nepali cohort (Figure 15B).
Figure 15. Cross-validated scores for the first predictive component (tcv[1]p) in 2-class OPLS-DA models of culture-positive typhoid and febrile/afebrile controls based on 15 plasma metabolites consistently regulated in the Bangladeshi cohort and the Nepali cohort (from Paper I) (error bars: mean score value with 95% confidence interval). A) S. Typhi samples significantly separated from febrile controls in the Bangladeshi cohort (Q2=0.535, p=0.016). B) S. Typhi samples significantly separated from afebrile controls in the Nepali cohort (Q2=0.722, p=4.8*10^-16).
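The two-cohort comparison described above amounts to a simple filter, sketched here under stated assumptions: each cohort provides one loading w* per metabolite, the first criterion is applied to the absolute value of w*, and a positive w* is taken to mean a higher relative concentration in the S. Typhi group. The function name, the metabolite identifiers and the loading values are hypothetical, and the ±SD criterion is not reproduced.

```python
def consistently_regulated(loadings_a, loadings_b, threshold=0.03):
    """Return metabolites significant in both cohorts with the same sign.

    loadings_a, loadings_b: dicts mapping metabolite id -> loading w*.
    threshold: assumed cut-off on |w*| (hypothetical reading of w*>0.03).
    """
    consistent = {}
    for name in sorted(loadings_a.keys() & loadings_b.keys()):
        wa, wb = loadings_a[name], loadings_b[name]
        significant = abs(wa) > threshold and abs(wb) > threshold
        same_direction = (wa > 0) == (wb > 0)
        if significant and same_direction:
            consistent[name] = "up" if wa > 0 else "down"
    return consistent


# Invented loadings for illustration only, NOT values from Papers I or II.
bangladeshi = {"m1": 0.12, "m2": -0.08, "m3": 0.05, "m4": 0.01}
nepali = {"m1": 0.09, "m2": -0.04, "m3": -0.06, "m4": 0.07, "m5": 0.20}

result = consistently_regulated(bangladeshi, nepali)
print(result)
```

In this toy input, m3 is significant in both cohorts but changes in opposite directions, m4 falls below the threshold in one cohort, and m5 was only measured in one cohort, so none of the three is retained.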
In addition to the Bangladeshi cohort, a validation cohort with patient samples
from both Bangladesh and Senegal including culture-confirmed S. Typhi
infection and febrile controls with malaria or infections caused by other
pathogens was analysed. The febrile controls were combined into one class and
compared to the S. Typhi samples which resulted in an OPLS-DA model where
the typhoid samples could be significantly separated from the febrile controls,
although with one sample from the other pathogens group clearly overlapping
with the typhoid group (Figure 16A). Since malaria has clinical features often
resembling typhoid fever, it is of great importance to be able to distinguish
patients with typhoid fever from malaria patients. Therefore, these two groups
were compared in a separate OPLS-DA model resulting in a significant
separation of the S. Typhi samples from the malaria samples (Figure 16B).
As a final step, the metabolites significant for separating i) the S. Typhi samples
from febrile controls (including malaria and infections by other pathogens) in
the Bangladeshi/Senegalese validation cohort, ii) the S. Typhi samples from
febrile controls in the primary Bangladeshi cohort and iii) the S. Typhi samples
from afebrile controls in the Nepali cohort from Paper I were compared.
Comparing degree of significance and direction of change for the metabolites
resulted in 24 metabolites being consistently regulated in the
Bangladeshi/Senegalese validation cohort and the primary Bangladeshi cohort
and/or the Nepali cohort. OPLS-DA models of the consistently regulated
metabolites resulted in a significant separation of S. Typhi from the respective
control samples in the Bangladeshi/Senegalese cohort (with some overlap)
(Figure 17A) and in the Nepali cohort (Figure 17C) and a weaker model in the
primary Bangladeshi cohort (Figure 17B).
Figure 16. Cross-validated scores for the first predictive component (tcv[1]p) in 2-class OPLS-DA models of culture-positive typhoid and febrile controls (including malaria and infections from other pathogens) from the Bangladeshi and Senegalese validation cohort based on 104 plasma metabolites (error bars: mean score value with 95% confidence interval). A) S. Typhi samples significantly separated from febrile controls including malaria and infections from other pathogens (with one clearly overlapping other pathogen sample) (Q2=0.404, p=7.2*10^-5). B) S. Typhi samples significantly separated from malaria samples (Q2=0.613, p=1.3*10^-4).
Figure 17. Cross-validated scores for the first predictive component (tcv[1]p) in 2-class OPLS-DA models of culture-positive typhoid and febrile/afebrile controls based on plasma metabolites consistently regulated in the primary Bangladeshi cohort, the Bangladeshi and Senegalese validation cohort and the Nepali cohort (from Paper I) (error bars: mean score value with 95% confidence interval). A) S. Typhi samples significantly separated from febrile controls including malaria and infections from other pathogens in the Bangladeshi and Senegalese validation cohort (with a few overlapping samples) (Q2=0.404, p=7.2*10^-5). B) S. Typhi samples separated from febrile controls in the Bangladeshi cohort (with some overlapping samples) (Q2=0.227, p=0.39). C) S. Typhi samples significantly separated from afebrile controls in the Nepali cohort (Q2=0.598, p=2.5*10^-11).
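The and/or rule used in the three-cohort comparison can be written out explicitly. This is a sketch that assumes each cohort has already been reduced to its significant metabolites, each with a direction of change (+1 for higher in S. Typhi, -1 for higher in controls); the function name and the toy identifiers are hypothetical.

```python
def validated_metabolites(validation, primary, nepali):
    """Keep metabolites significant in the validation cohort that show the
    same direction of change in the primary cohort and/or the Nepali cohort.

    Each argument maps metabolite id -> direction (+1 or -1) and contains
    only the metabolites that were significant in that cohort.
    """
    kept = []
    for name, direction in validation.items():
        agrees_primary = primary.get(name) == direction
        agrees_nepali = nepali.get(name) == direction
        if agrees_primary or agrees_nepali:
            kept.append(name)
    return sorted(kept)


# Invented example, NOT data from the thesis.
validation_cohort = {"a": +1, "b": -1, "c": +1, "d": -1}
primary_cohort = {"a": +1, "c": -1}
nepali_cohort = {"b": -1, "c": -1, "e": +1}

kept = validated_metabolites(validation_cohort, primary_cohort, nepali_cohort)
print(kept)
```

Here "a" is supported by the primary cohort and "b" by the Nepali cohort, whereas "c" changes in the opposite direction in both and "d" is not significant elsewhere, so only the first two are kept.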
4.2.3. Discussion and Comments
The first objective of this study was to investigate the metabolite patterns of
culture-positive typhoid fever and febrile controls and to see how a group of
culture-negative, but clinically suspected typhoid fever samples, would behave
since this is a challenging group to diagnose. Analysis by GCxGC-TOFMS and
multivariate data analysis resulted in a significant separation of the culture-
positive typhoid samples from the febrile controls. Through prediction, a
heterogeneity in the behaviour of the culture-negative typhoid samples was
found with some culture-negative samples behaving similarly to culture-positive
typhoid and others behaving similarly to controls. Although the true nature of
these samples is difficult to confirm in retrospect, the theory of the typhoid-predicted
culture-negative samples being true typhoid samples was somewhat
strengthened by a positive PCR amplification of S. Typhi for three of the
samples. Due to the low sensitivity of blood culturing, many of the culture-negative
typhoid samples could be true typhoid cases, but they could also be
affected by other infections. Studies from Nepal have reported Rickettsia spp. as
a common cause of infection in culture-negative typhoid cases193. Rickettsial
infections were also suspected as the cause of some of the culture-negative
typhoid cases in other Nepali studies where differences in treatment efficacy
between two antimicrobials were detected comparing culture-positive and
culture-negative typhoid cases83,85. In addition, differences in frequencies of
certain clinical features between culture-positive and culture-negative typhoid samples
were also observed83, something that was not clearly seen in the investigation of
the clinical data in this study. The obvious heterogeneity of the culture-negative
typhoid population makes it a difficult but important group to investigate.
The second objective was to validate the findings from Paper I by comparing
metabolites significant for the separation of S. Typhi samples from
febrile/afebrile controls. A large number of consistently regulated metabolites
were found in the two studies. The six highlighted metabolites in Paper I were
not discussed more in comparison to the metabolites in Paper II due to the fact
that these metabolites were selected to separate S. Typhi from S. Paratyphi A as
well as enteric fever from control, thus they were shifted somewhat towards the
Typhi/Paratyphi comparison. Only the metabolites significant in the S. Typhi vs.
control model in Paper I were used in the comparison. In addition to the
consistently regulated metabolites found during the comparison there were also
inconsistently regulated metabolites, i.e. a metabolite could have a higher
relative concentration in the S. Typhi group in one cohort and in the control
group in the other cohort. The presence of inconsistently regulated metabolites
might be related to the difference in the control groups (febrile controls in
Paper II and afebrile in Paper I). Taking this into account there is a strength
in the consistently regulated metabolites since they are significant for separating
S. Typhi samples from controls for two different types of control groups. It
should also be noted that different data processing methods were used in
Papers I and II (see chapter 3.4. Processing of metabolomics data); however,
this will probably not have a great impact on the result as a whole, although it can
cause small changes in individual metabolites.
As a second validation step, a separate validation cohort from Bangladesh and
Senegal was analysed, this time with GC-TOFMS. The cohort included culture-
confirmed S. Typhi samples along with febrile controls, including both malaria
and infections caused by other pathogens. Again, a number of consistently
regulated metabolites were found with the criteria of being significant in the
Bangladeshi/Senegalese validation cohort and the primary Bangladeshi cohort
and/or the Nepali cohort. A limitation of the study is the small size of the
cohorts, but at the same time there is a strength in verifying metabolite patterns
in several cohorts with different characteristics. In addition, the presence of a
malaria control group in the Bangladeshi/Senegalese validation cohort was used
to investigate the difference in metabolite patterns between typhoid and malaria
samples. It is of importance to investigate the metabolite patterns of enteric
fever in relation to additional febrile illnesses with clinical features resembling
typhoid fever. The S. Typhi samples could be differentiated from the malaria
samples. An interesting observation in the typhoid versus malaria model was
the fact that three of the five typhoid samples from Senegal were overlapping
somewhat into the malaria group (Figure 16B). This overlap could possibly be
indicative of a co-infection of typhoid and malaria; however, despite further
investigation, this could not be verified. See chapter 4.4.2. Comparing enteric
fever and malaria for more discussion about the metabolites in the
typhoid/malaria comparison.
A third, smaller objective was to investigate the diagnostic potential for typhoid
fever in urine. Therefore, urine samples from the primary Bangladeshi cohort
were analysed with UHPLC-QTOFMS. A significant separation of culture-positive
typhoid samples from febrile controls was obtained, similar to what was
seen in the plasma samples, although the prediction of the culture-negative/clinically
suspected samples differed somewhat compared to plasma.
Since the main focus here was on the plasma samples and due to the more
complicated identification process for LC-MS data, the metabolites for the urine
samples were not investigated in more detail. However, it would be of interest to
compare the metabolite patterns in plasma and urine in the context of typhoid
fever diagnosis. Urine sampling would be beneficial for use in diagnostic tests in
areas where there may be a negative cultural attitude towards blood sampling.
In summary, this study has i) shown the possibility of using metabolite patterns
to separate patients with culture-confirmed typhoid fever from febrile controls
(also including malaria samples), ii) shown a novel approach to investigate
culture-negative but clinically suspected typhoid cases and iii) validated a set of
metabolites across different patient cohorts with the potential of becoming
diagnostic biomarkers for acute enteric fever. Taken together these results
represent a second step on the way towards improving the diagnostic methods
of acute enteric fever using metabolite patterns.
4.3. Paper III
Metabolite biomarkers of typhoid chronic carriage
A distinctive feature of S. Typhi and S. Paratyphi A infections is that they can
cause an often asymptomatic chronic infection22,194 following an episode of acute
enteric fever or even with no documented history of acute disease13. A common
site for the colonization of the bacteria during chronic infection is the
gallbladder195 which has been shown to be associated with biofilm formation on
gallstones27. Carriers of the chronic infection can shed bacteria through the
faeces and thereby contribute to the transmission of enteric fever195. Being able
to detect these chronic carriers is therefore of great importance196. Existing
methods for detection of chronic carriers of S. Typhi and S. Paratyphi A
infection suffer from limitations regarding sensitivity, usefulness in endemic
areas and high invasiveness20.
4.3.1. Objective
The objectives of the third study were to i) investigate the metabolite patterns in
plasma samples from patients in Nepal undergoing cholecystectomy with
confirmed gallbladder carriage of either S. Typhi or S. Paratyphi A together with
non-carrier controls and ii) compare the metabolite patterns during chronic
carriage with the patterns related to acute enteric fever presented in
Paper I. The study setup and a brief overview of the methods used are given in
Figure 18.
Figure 18. Overview of study design and methods used in Paper III.
4.3.2. Results
In the initial OPLS-DA modelling of the plasma metabolite patterns, the two
carrier groups were combined into one and compared to the non-carriage
controls. This resulted in a significant separation of the Salmonella carriage
samples from the non-carriage controls (Figure 19A). The S. Typhi and S.
Paratyphi A carriage samples were then compared to the non-carriage controls
in individual models also resulting in significant discrimination (Figure 19C
and D). Further, the two carrier groups were compared to each other, indicating
a difference between S. Typhi and S. Paratyphi A carriage (Figure 19B).
However, this model was weaker than the other models, and the small
sample number in the S. Paratyphi A group should be considered.
Figure 19. Cross-validated scores for the first predictive component (tcv[1]p) in 2-class OPLS-DA models of Salmonella carriage and non-carriage controls based on 195 metabolites (error bars: mean score value with 95% confidence interval). A) Salmonella carriage samples significantly separated from non-carriage controls (Q2=0.613, p=2.8*10^-6). B) S. Typhi carriage samples separated from S. Paratyphi A carriage samples (with some overlapping samples) (Q2=0.321, p=0.070). C) S. Typhi carriage samples significantly separated from non-carriage controls (Q2=0.607, p=3.1*10^-5). D) S. Paratyphi A carriage samples significantly separated from non-carriage controls (Q2=0.630, p=3.7*10^-4).
Investigating the metabolite distribution in the individual carriage models
revealed some interesting patterns. In the model comparing S. Typhi carriage
samples and non-carriage controls, the distribution of metabolites was
somewhat shifted towards the non-carriage controls, meaning that more
metabolites had a higher relative concentration in the non-carrier controls
compared to the S. Typhi carriage samples (Figure 20A). In the model
comparing S. Paratyphi A carriage samples to non-carriage controls a more even
distribution of the metabolites was seen between the two sample groups
(Figure 20B).
Figure 20. Model covariance loadings for the first predictive component (w*[1]) in 2-class OPLS-DA models of Salmonella carriage and non-carriage controls based on 195 metabolites. A) Covariance loadings showing a shift towards the non-carriage controls compared to the S. Typhi carriage samples. B) Covariance loadings showing a more equal distribution between the S. Paratyphi A carriage samples and the non-carriage controls.
A comparison was made between the metabolites significant for separating
Salmonella carriage samples from non-carriage controls and the metabolites
significant for separating the acute enteric fever samples from afebrile controls
in Paper I. The degree of significance and direction of change were compared,
and the Venn diagram in Figure 21A summarizes the metabolites and their
alterations in the different comparisons. A number of metabolites were
consistently regulated in carriage and acute infection (with three being
increased in the Salmonella group and seven being increased in the control
group). A larger number of metabolites were inconsistently regulated in carriage
and acute infection (with 16 being increased in the Salmonella group in acute
infection and in the control group in the carriage samples). Most of the
metabolites were significant in only one of the studies. Of those, we highlighted
five metabolites that were only significant in the carriage samples and increased
in the Salmonella group (Figure 21B). Two of the metabolites could be
assigned a putative annotation, glutaric acid and hexanoic acid, while the others
were considered unknown. In an OPLS-DA model using only the five
metabolites, the Salmonella carriage samples could be significantly
distinguished from the non-carriage controls although with a few overlapping
samples (Figure 21C).
Figure 21. Comparison of metabolites between acute infection and carriage. A) Venn diagram showing consistently regulated metabolites in acute infection (Paper I) and carriage in the overlapping circles, inconsistently regulated metabolites in acute infection and carriage in the overlapping ellipses and metabolites only significant in either study in the remaining part of the circles. The squared box highlights metabolites of particular interest by being significant (2) or present (3) only in the carriage dataset. B) Table of the five highlighted metabolites from the Venn diagram. RI: first-dimension retention indexes, Chronic: T/P indicating an increase in relative concentration in the Salmonella carriage samples compared to the non-carriage controls, Acute: n.s. indicating a non-significant metabolite and “-” a metabolite not present. C) Cross-validated scores for the first predictive component (tcv[1]p) in 2-class OPLS-DA models showing a significant separation of Salmonella carriage samples from non-carriage controls based on five metabolites (Q2=0.604, p=1.4*10^-7). Error bars represent mean score value with 95% confidence interval.
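The grouping behind the Venn diagram can be expressed as a small classification routine. The sketch below assumes that each study has been reduced to its significant metabolites together with the group showing the higher relative concentration; the function name, the region labels and the toy identifiers are hypothetical and do not come from Paper III.

```python
def venn_regions(acute, carriage):
    """Sort metabolites into the regions of an acute-versus-carriage comparison.

    acute, carriage: dicts mapping metabolite id -> group with the higher
    relative concentration ("salmonella" or "control"), significant
    metabolites only.
    """
    regions = {
        "consistent": [],
        "inconsistent": [],
        "acute only": [],
        "carriage only": [],
    }
    for name in sorted(acute.keys() | carriage.keys()):
        if name in acute and name in carriage:
            same = acute[name] == carriage[name]
            key = "consistent" if same else "inconsistent"
        elif name in acute:
            key = "acute only"
        else:
            key = "carriage only"
        regions[key].append(name)
    return regions


# Invented example, NOT data from the thesis.
acute_study = {"u1": "salmonella", "u2": "salmonella", "u3": "control"}
carriage_study = {"u1": "salmonella", "u2": "control", "u4": "salmonella"}

regions = venn_regions(acute_study, carriage_study)
print(regions)
```

Candidate carriage biomarkers of the kind highlighted in the text would correspond to entries in the "carriage only" list whose direction is "salmonella".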
4.3.3. Discussion and Comments
The first objective of this study was to investigate metabolite patterns related to
chronic Salmonella carriage in the gallbladder. Plasma samples analysed with
GCxGC-TOFMS and OPLS-DA revealed a significant separation of Salmonella
carriage samples from non-carriage controls both with the S. Typhi and S.
Paratyphi A carriage groups combined and separated. A model indicating a
separation between the S. Typhi and S. Paratyphi A carriage samples was also
obtained, although not as strong as the models comparing carriage samples and
non-carriage controls. A similar pattern was seen in Paper I with significant
separations of acute S. Typhi and S. Paratyphi A infection respectively from
controls and a weaker separation of S. Typhi infection from S. Paratyphi A
infection. However, due to the very small sample size in the S. Paratyphi A
carriage group the results of the models with separate carriage groups should be
interpreted carefully.
Differences in the distribution of metabolites in the separate S. Typhi and S.
Paratyphi A models were detected. More metabolites had a higher relative
concentration in the non-carriage group compared to the S. Typhi group and
this pattern was not seen in the S. Paratyphi A model. One speculation is that this
metabolite shift is related to the Vi polysaccharide capsule, which is present in
S. Typhi bacteria but absent in S. Paratyphi A bacteria. Studies have shown a
relationship between the presence of the Vi capsule and a suppression of the
inflammatory response during infection48,197, something that can be related to
the stealth features of S. Typhi and the ability to cause a chronic carrier state.
The decrease in relative concentration of many metabolites in the S. Typhi
carriage samples compared to the non-carriage controls could possibly be
related to this suppression. Despite this difference between S. Typhi and S.
Paratyphi A, both do cause a similar acute disease and both can establish a
chronic carriage state. The use of the Vi capsule is only one of the possible
mechanisms unique for the typhoidal Salmonella bacteria. It is interesting that
these differences may possibly be reflected in the metabolite patterns.
The second objective of this study was to compare the metabolite patterns
observed during chronic carriage to the patterns related to acute enteric fever in
Paper I. For this, the latent significance method was used for highlighting
significant metabolites in the carrier data as well as re-applied to the data from
Paper I to obtain a reliable comparison. In the latent significance approach
(see chapter 3.6.5. Significant metabolites), the orthogonal variation is removed
which gives more focus on the variation related to the sample groups. Using the
latent significance approach a statistical significance limit can be set on the
latent model covariance loadings. The models with the S. Typhi and S. Paratyphi
A groups combined were used both for chronic carriage and acute infection. The
comparison resulted in a small number of consistently regulated metabolites,
somewhat more inconsistently regulated metabolites and a larger number of
metabolites only significant in acute enteric fever or in chronic carriage. All of
the inconsistently regulated metabolites showed a higher relative concentration
in the Salmonella group during acute enteric fever but a higher relative
concentration in the control group during chronic carriage. Using the individual
models for S. Typhi and S. Paratyphi A for the same comparison showed that
the majority of the inconsistently regulated metabolites originated from the S.
Typhi model. In general, there seems to be a reduced level of many metabolites
in the S. Typhi carriage group both in relation to the S. Paratyphi A carriage
group and to the acute S. Typhi infection group. Again, this might be related to
the Vi capsule/inflammatory response suppression discussed above, considering
that a suppressed inflammatory response would be of benefit for S. Typhi
during the “silent” chronic stage. It should be noted that this is merely a
speculation that needs further detailed investigation.
The five metabolites significant during chronic carriage only and increased in
the Salmonella group were of particular interest. These five metabolites
represent possible diagnostic biomarkers for Salmonella carriage and therefore
an OPLS-DA model based only on these five metabolites was calculated.
Although there was some overlap between the sample classes, the model was
able to significantly separate Salmonella carriage samples from non-carriage
controls.
In summary, this study has shown the possibility of detecting differences in
metabolite patterns between patients undergoing cholecystectomy with and
without confirmed Salmonella gallbladder carriage. A small set of metabolites
that were unique to chronic carriage compared to acute enteric fever was
highlighted as potential diagnostic biomarkers for Salmonella chronic carriage.
This small study represents a first step towards improving the diagnostic
methods for detecting chronic Salmonella carriers. Here the study population
consisted of patients with a gallbladder condition and it cannot be ruled out that
this or other underlying factors might affect the detected metabolite patterns.
The condition was, however, the same for both carriers and non-carriers, and
since there is a higher risk for Salmonella carriage with, for example,
cholecystitis28, there is a value in investigating carriers in this setting. Using a
similar population, it would be of interest to include patients with growth of
other pathogens in their gallbladder for detection of more Salmonella-specific
metabolites. Ideally, asymptomatic carriers should be the target group but then
a larger population would need to be screened for detection of chronic carriers.
It can be debated if it is practically feasible to detect and treat all chronic
carriers, considering the difficulties in screening an asymptomatic population,
the large amounts of antimicrobials needed for treatment and the invasive
nature of gallbladder removal. However, it is clear that the chronic carriers are
an important group to consider since they contribute to disease transmission.
4.4. Metabolites and metabolite patterns
A few additional aspects of the metabolite patterns of interest from Papers I-III will be presented here.
4.4.1. Metabolites affected during acute enteric fever
Significant metabolite patterns during acute enteric fever have been found in
both Papers I and II. Here metabolites consistently regulated in at least two of
the cohorts are considered. Among the consistently regulated metabolites, the
majority had a higher relative concentration in the enteric fever group
compared to the controls. For individual metabolite classes a few general trends
could be seen. For carbohydrates/saccharides there was a general decrease in
the enteric fever group, something that might be related to the high-energy
requirements during infection. There was a general increase in the levels of
many fatty acids in the enteric fever group (with stearic acid being increased in
all three cohorts). An increase in fatty acids has also been reported in the studies
of other infections, such as dengue, malaria and sepsis128,129,133. Even though this
increase in fatty acids might be a sign of general infection, the individual
metabolites changed and the levels might differ for different infections. For the
amino acids, most were increased in the enteric fever group (such as
phenylalanine, leucine, isoleucine and valine) while others were decreased
(tryptophan and lysine). Tyrosine was inconsistently regulated. Increased levels
of phenylalanine and valine and decreased levels of tryptophan were also seen in
the studies of other infections. Usually a trend of decreasing amino acid levels is
seen during infection; however, the changes in amino acids can differ depending
on the type of infection198. Phenylalanine is known to increase during many
infections and the decrease of tryptophan has been seen previously in enteric
fever198. Examples of other consistently regulated metabolites are glycerol-3-
phosphate, glyceric acid and 1-monostearoylglycerol. There are several
interesting metabolites related to acute enteric fever that would be of interest to
investigate further, together with several unidentified metabolites that might be
of interest for future studies.
4.4.2. Comparing enteric fever and malaria
A significant separation in metabolite patterns between acute enteric fever and
malaria was found in Paper II. There were some similarities in the observed
metabolite pattern compared to the models of acute enteric fever versus control
seen previously (in the cohorts in Papers I and II). As discussed above there
were several metabolites found increased during acute enteric fever that were
also found in studies of other infections, including malaria. A few of these
metabolites were however found increased in the S. Typhi group compared to
malaria. Two examples are phenylalanine and stearic acid that were shown to
increase during malaria in previous studies129,130 but increased in the enteric
fever group both compared to controls and malaria alone in this work. A few
metabolites that were decreased in the malaria groups in the studies mentioned
above were also decreased in the malaria group in the model comparing enteric
fever and malaria. One example is alanine. There were also differences in the
metabolite pattern in the enteric fever versus malaria model compared to the
other models of acute enteric fever in this work. Among them were glucose- and
fructose-6-phosphate, which were increased in the enteric fever group compared
to malaria and non-significant in the comparison to the other control subgroup
in the same cohort. The similarities in metabolite patterns between the models
comparing enteric fever to malaria and to the other controls show that there
are metabolites with the possibility of distinguishing enteric fever from febrile
controls including malaria. The differences in metabolite patterns between
enteric fever and malaria could be of interest for further investigations into
specific differences between the infections.
4.4.3. Comparing acute enteric fever and chronic carriage
In the comparison of metabolites related to acute enteric fever and chronic
carriage, a general trend could be seen for fatty acids. Shorter-chain fatty acids
were either inconsistently regulated or significant during chronic carriage only
while many longer-chain and unsaturated fatty acids were significant during
acute infection only. Among the metabolites consistently regulated in acute
infection and chronic carriage, glycerol-3-phosphate showed a higher relative
concentration in the Salmonella group, not only in this comparison but in all
cohorts included in this work. An unidentified metabolite was also consistently
regulated across all studies but had a higher relative concentration in the
control group. Among the metabolites only significant during chronic carriage,
the most interesting were the five metabolites mentioned previously, including
glutaric acid, hexanoic acid and three unidentified metabolites. A possible connection to
fermentation of the gut microbiome could be made for hexanoic acid and
glutaric acid. There were several metabolites only significant during chronic
carriage but with a higher relative concentration in the non-carriage controls. A
majority of these metabolites were unidentified but among the identified or
classified there were a few monosaccharides. Many of the metabolites found
increased in the Salmonella group during acute infection (in Papers I and II)
were not found significant during chronic carriage (in Paper III). This suggests
that there are differences in metabolite responses during acute enteric fever and
chronic carriage.
5. Conclusions and Future perspectives
There is a need for improved diagnostic methods for enteric fever with emphasis
on diagnosing patients with acute enteric fever and for detecting chronic
carriers. Metabolomics has become a popular tool in the search for diagnostic
biomarkers. One underlying hypothesis within metabolomics is that a particular
disease, in this case enteric fever, will give rise to a specific metabolite pattern
that can be detected in biological samples. In this thesis, a mass spectrometry-
based metabolomics approach has been used to investigate the metabolite
patterns in plasma and urine from patients with acute enteric fever along with
chronic Salmonella carriers. The results summarized in this work highlight the
usefulness of applying metabolomics to enteric fever.
In Papers I and II, differences in metabolite patterns between acute enteric
fever and afebrile/febrile controls were detected. The presence of consistently
regulated metabolites between the cohorts suggests that there is a shift in
several metabolites during acute enteric fever that is independent of the patient
cohort. In Paper II, the novel approach to investigate the challenging patient
group of blood culture-negative but clinically suspected enteric fever highlights
the problems related to blood culture diagnosis. In addition, the detected
differences between enteric fever and malaria put focus on the importance of
including samples from febrile diseases with clinical features similar to enteric
fever in the search for Salmonella-specific metabolite biomarkers.
In Paper III, differences in metabolite patterns between Salmonella carriage
samples and non-carriage controls were detected. The results of comparing
metabolite patterns between Papers I and III suggest that there are more
differences than similarities in the metabolite patterns during acute enteric
fever and chronic carriage. In Papers I and III, differences in metabolite
patterns between the two causative agents of enteric fever, S. Typhi and S.
Paratyphi A, were detected. This could be of interest in relation to the reported
regional differences in antimicrobial susceptibility for the two serovars and the
absence of a vaccine for S. Paratyphi A. The presence of the Vi capsule in S.
Typhi but not in S. Paratyphi A was possibly reflected by differences in the
distribution of the metabolites in the separate S. Typhi and S. Paratyphi A
models compared to controls in Paper III.
A key factor for future studies is to include additional febrile diseases with
a clinical presentation similar to enteric fever. This would increase the
possibility of finding enteric fever-specific metabolites. In this thesis the main
sample groups have been enteric fever and different types of controls. However,
in Paper II there was a control subgroup of malaria patients, which could be
differentiated from enteric fever. In addition, the S. Typhi and S. Paratyphi A
samples could be differentiated, which can be considered a strength of the
methodology used. In future work, larger studies in terms of sample numbers would
be preferred. However, the study design is crucial and a well-designed study can
often afford a smaller sample size. For detection of chronic carriers, a large
asymptomatic population would preferably be screened instead of only focusing
on carriers with a gallbladder condition. This would however be a challenging
task. It would also be of interest to include carriers of other pathogens. In this
thesis, non-targeted metabolomics has been used. Further non-targeted
metabolomics studies of enteric fever could be of interest but it would also be
interesting to use a targeted approach based on the results presented in this
thesis. One specific area could be to investigate the effect of the S. Typhi Vi
capsule.
In summary, the work presented in the thesis has i) highlighted the potential of
using metabolomics in the search for diagnostic biomarker patterns for enteric
fever, focusing both on acute infection and chronic carriers, ii) suggested
interesting metabolite patterns that can be considered as potential diagnostic
biomarkers, although more work is required to investigate and validate these
findings, and iii) provided clues for further investigations of the mechanisms of
enteric fever. In addition, the strength of using a combination of metabolites as
a latent biomarker compared to single biomarkers has been emphasized.
Altogether, this thesis has contributed to the work towards improved diagnostic
methods for different stages of enteric fever.
6. Acknowledgements
Thank you to everyone who has in some way contributed to making this thesis
possible. I would like to extend a special thank you to:
Henrik, for giving me the opportunity to be part of your research group, first
through degree projects and project work and then as a PhD student. You have
always encouraged me and believed in me, even when I have not always done so
myself. I have learned an incredible amount during these years. Anders, for
being my assistant supervisor. You have taught me a lot about infectious
diseases, and although we have not always met that often, it has felt good to
discuss the project with you and gain new perspectives on things. Stephen,
thank you for introducing me to the fascinating world of enteric fever and for
letting us collaborate with you on this interesting project. I am also grateful
for your help in turning our sometimes slightly “boring” manuscript texts into
better versions.
Current and former members of Henrik's group: Lina, for keeping me company
through a large part of the PhD period, teaching on many courses together, being
tourists in London and, above all, sharing an office through all these years.
Anna, for being such a committed person and because one always learns something
new when talking to you. We had a nice time sharing an office on floor 5.
Elin C and Elin T, for being good role models. As Elin number three in the
group, I became “lill-Elin” (little Elin), since I was the newest and also the
shortest. Pär, for help with processing, modelling ideas and my attempts at
writing my own Matlab scripts. Benny, for interesting discussions about GCxGC,
among much else, when we shared an office. Kristina, my current office mate,
for pleasant conversations, even though I have not always been very talkative in
the midst of thesis writing. Olena, Magnus and Tommy, for pleasant discussions.
Also thanks to Ilona; it is nice to have you at our group meetings, and we had a
great time sightseeing in SF.
Everyone at SMC who has helped in various ways: Inga-Britt and Krister, who
showed me everything from extraction to how to run the instruments. Jonas and
Annika, for help with planning analyses and with the instruments. Hans, for
explaining processing methods. Thanks also to Thomas, Maria, Jenny, Pernilla
and Joakim. All current and former colleagues in the “chemometrics corridor”:
Matilda, Frida, Daniel, Tomas, Rickard, David, Izabella, Beatriz and
Johan.
Peter Haglund, for the opportunity to analyse samples on the GCxGC instrument
and for help with the analyses. Thanks to Konstantin Kouremenos for
teaching me a lot about GCxGC and for all the time we spent with “Peggy”. All
the friendly colleagues at the Department of Chemistry. A special thank you to
the administrators and technicians for help with everything from courses,
certificates, employment matters and travel expense claims to computers: Maria,
Lars-Göran, Rosita, Sara, Carina, Barbro and others. Madeleine Ramstedt and
Jan Larsson, in my reference group, for your support and the interest you
showed at all the follow-up meetings.
Other collaborators: Thanks to the group in Oxford for the collaboration on the
typhoid challenge project (although not included in this thesis), with special
thanks to Claire for the nice dinner in Boston and Helene for nice discussions
about the data. Maria, Hans and Anna at Molecular Biology, for the collaboration
on the MRSA/MSSA study during my degree projects. Alicia, Anna and Daniel at
Clinical Microbiology, for discussions about metabolomics and for pleasant
company at conferences. Thank you to all of the co-authors on the papers. Thank
you to everyone who has read and commented on my thesis during the writing
process.
Everyone who started the chemistry programme in 2007; there were not many of us,
but we had a good time together: Cecilia, for being my friend since the first
day we met ten years ago, for our being alike in so many ways, not least that we
both come from Jämtland, and for all the fun we have had during our student
years, during our PhD studies and outside the university. Mari, for being the
first person I talked to on my first day at the university and for all the fun
times that followed, especially when we lived in the same student corridor. It
is always just as much fun to meet up, even though it does not happen very often
now. Marcus (more further down), Kalle, Viktor and John, for all the hours in
lecture halls, in the lab and at fun activities during our student years. Thanks
also to the other chemists and TNK students, especially Jennifer, for the fun
first year and for the nice time we had when we met again after many years, now
together with our little ones.
My friends from Strömsund: Marlene, for becoming my best friend in first grade
and sticking together ever since, for your company during your time in Umeå and
because it is always just as much fun to meet when I am “home”; these days it is
full speed ahead with our little girls. Therese, for the good times during upper
secondary school and all our chats, although there is not as much time for
talking when we meet nowadays because of three little rascals. Lina, for your
positive energy and for joking about a Nobel Prize (at least it turned into a
thesis!).
My relatives: Annbritt and Bernt, Madelene, Sture, Ronny, Irene, Catrine
and Johnny with families, for all the pleasant get-togethers involving
plenty of card games. Irene, Sven-Olov, Simon and Alexander in Finspång, for
welcoming me into your family.
Mum and Dad, for always being there, for your support, for cheering me on in
your own way (“don't count on it going this well later on”) and for all the fun
we have together. Marcus, for your support, for being able to make decisions
when I waver, for your attitude that everything will work out and because we are
such a good match. Matilda, our little ray of sunshine, for being here, for
making us laugh and marvel every time you learn something new.
64
7. References
1. Lozano, R. et al. Global and regional mortality from 235 causes of death for 20 age groups in 1990 and 2010: a systematic analysis for the Global Burden of Disease Study 2010. The Lancet 380, 2095–2128 (2012).
2. Parry, C. M., Wijedoru, L., Arjyal, A. & Baker, S. The utility of diagnostic tests for enteric fever in endemic locations. Expert Rev. Anti Infect. Ther. 9, 711–725 (2011).
3. Baker, S., Favorov, M. & Dougan, G. Searching for the elusive typhoid diagnostic. BMC Infect. Dis. 10, 45 (2010).
4. Bhan, M. K., Bahl, R. & Bhatnagar, S. Typhoid and paratyphoid fever. The Lancet 366, 749–762 (2005).
5. Woods, C. W. et al. Emergence of Salmonella enterica serotype Paratyphi A as a major cause of enteric fever in Kathmandu, Nepal. Trans. R. Soc. Trop. Med. Hyg. 100, 1063–1067 (2006).
6. Ochiai, R. L. et al. Salmonella Paratyphi A Rates, Asia. Emerg. Infect. Dis. 11, 1764–1766 (2005).
7. Sahastrabuddhe, S., Carbis, R., Wierzba, T. F. & Ochiai, R. L. Increasing rates of Salmonella Paratyphi A and the current status of its vaccine development. Expert Rev. Vaccines 12, 1021–1031 (2013).
8. Arndt, M. B. et al. Estimating the Burden of Paratyphoid A in Asia and Africa. PLoS Negl. Trop. Dis. 8, (2014).
9. Maskey, A. P. et al. Salmonella enterica serovar Paratyphi A and S. enterica serovar Typhi cause indistinguishable clinical syndromes in Kathmandu, Nepal. Clin. Infect. Dis. 42, 1247–1253 (2006).
10. Karkey, A. et al. Differential Epidemiology of Salmonella Typhi and Paratyphi A in Kathmandu, Nepal: A Matched Case Control Investigation in a Highly Endemic Enteric Fever Setting. PLoS Negl Trop Dis 7, e2391 (2013).
11. Vollaard AM, Ali S, van Asten HH & et al. Risk factors for typhoid and paratyphoid fever in jakarta, indonesia. JAMA 291, 2607–2615 (2004).
12. World Health Organization. Dept. of Vaccines and Biologicals. Background document : the diagnosis, treatment and prevention of typhoid fever. (2003).
13. Parry, C. M., Hien, T. T., Dougan, G., White, N. J. & Farrar, J. J. Typhoid Fever. N. Engl. J. Med. 347, 1770–1782 (2002).
14. Wain, J., Hendriksen, R. S., Mikoleit, M. L., Keddy, K. H. & Ochiai, R. L. Typhoid fever. The Lancet (2014). doi:10.1016/S0140-6736(13)62708-7
15. Steele, A. D., Hay Burgess, D. C., Diaz, Z., Carey, M. E. & Zaidi, A. K. M. Challenges and Opportunities for Typhoid Fever Control: A Call for Coordinated Action. Clin. Infect. Dis. Off. Publ. Infect. Dis. Soc. Am. 62, S4–S8 (2016).
65
16. Reddy, E. A., Shaw, A. V. & Crump, J. A. Community-acquired bloodstream infections in Africa: a systematic review and meta-analysis. Lancet Infect. Dis. 10, 417–432 (2010).
17. Connor, B. A. & Schwartz, E. Typhoid and paratyphoid fever in travellers. Lancet Infect. Dis. 5, 623–628 (2005).
18. Crump, J. A., Luby, S. P. & Mintz, E. D. The global burden of typhoid fever. Bull. World Health Organ. 82, 346–353 (2004).
19. Buckle, G. C., Walker, C. L. F. & Black, R. E. Typhoid fever and paratyphoid fever: Systematic review to estimate global morbidity and mortality for 2010. J. Glob. Health 2, (2012).
20. Gunn, J. S. et al. Salmonella chronic carriage: epidemiology, diagnosis, and gallbladder persistence. Trends Microbiol. 22, 648–655 (2014).
21. Levine, M. M., Black, R. E. & Lanata, C. Precise estimation of the numbers of chronic carriers of Salmonella typhi in Santiago, Chile, an endemic area. J. Infect. Dis. 146, 724–726 (1982).
22. Mortimer, P. P. Mr N the milker, and Dr Koch’s concept of the healthy carrier. The Lancet 353, 1354–1356 (1999).
23. Brooks, J. The sad and tragic life of Typhoid Mary. CMAJ Can. Med. Assoc. J. 154, 915–916 (1996).
24. Merrell, D. S. & Falkow, S. Frontal and stealth attack strategies in microbial pathogenesis. Nature 430, 250–256 (2004).
25. Wain, J. et al. Quantitation of Bacteria in Bone Marrow from Patients with Typhoid Fever: Relationship between Counts and Clinical Features. J. Clin. Microbiol. 39, 1571–1576 (2001).
26. van Velkinburgh, J. C. & Gunn, J. S. PhoP-PhoQ-Regulated Loci Are Required for Enhanced Bile Resistance in Salmonella spp. Infect. Immun. 67, 1614–1622 (1999).
27. Crawford, R. W. et al. Gallstones play a significant role in Salmonella spp. gallbladder colonization and carriage. Proc. Natl. Acad. Sci. U. S. A. 107, 4353–4358 (2010).
28. Vaishnavi, C. et al. Epidemiology of Typhoid Carriers among Blood Donors and Patients with Biliary, Gastrointestinal and Other Related Diseases. Microbiol. Immunol. 49, 107–112 (2005).
29. Caygill, C. P. J., Hill, M. J., Braddick, M. & Sharp, J. C. M. Cancer mortality in chronic typhoid and paratyphoid carriers. The Lancet 343, 83–84 (1994).
30. Luxemburger, C. et al. Risk factors for typhoid fever in the Mekong delta, southern Viet Nam: a case-control study. Trans. R. Soc. Trop. Med. Hyg. 95, 19–23 (2001).
31. Olsen, S. J. et al. Outbreaks of typhoid fever in the United States, 1960-99. Epidemiol. Infect. 130, 13–21 (2003).
66
32. Dongol, S. et al. The Microbiological and Clinical Characteristics of Invasive Salmonella in Gallbladders from Cholecystectomy Patients in Kathmandu, Nepal. PLoS ONE 7, e47342 (2012).
33. Santos, R. L. et al. Animal models of Salmonella infections: enteritis versus typhoid fever. Microbes Infect. 3, 1335–1344 (2001).
34. Parkhill, J. et al. Complete genome sequence of a multiple drug resistant Salmonella enterica serovar Typhi CT18. Nature 413, 848–852 (2001).
35. Sabbagh, S. C., Forest, C. G., Lepage, C., Leclerc, J.-M. & Daigle, F. So similar, yet so different: uncovering distinctive features in the genomes of Salmonella enterica serovars Typhimurium and Typhi. FEMS Microbiol. Lett. 305, 1–13 (2010).
36. Mathur, R. et al. A Mouse Model of Salmonella Typhi Infection. Cell 151, 590–602 (2012).
37. Fang, F. C. & Bäumler, A. J. A Tollgate for Typhoid. Cell 151, 473–475 (2012).
38. Waddington, C. S. et al. An Outpatient, Ambulant-Design, Controlled Human Infection Model Using Escalating Doses of Salmonella Typhi Challenge Delivered in Sodium Bicarbonate Solution. Clin. Infect. Dis. ciu078 (2014). doi:10.1093/cid/ciu078
39. Jones, C., Darton, T. C. & Pollard, A. J. Why the development of effective typhoid control measures requires the use of human challenge studies. Microb. Immunol. 5, 707 (2014).
40. Guerrant, R. L. & Steiner, T. S. Principles and Syndromes of Enteric Infection. in Mandell, Douglas, and Bennett´s Principles and Practice of Infectious Diseases 1215–1285 (Elsevier/Churchill Livingstone, 2005).
41. Tsolis, R. M., Young, G. M., Solnick, J. V. & Bäumler, A. J. From bench to bedside: stealth of enteroinvasive pathogens. Nat. Rev. Microbiol. 6, 883–892 (2008).
42. Dougan, G. & Baker, S. Salmonella enterica Serovar Typhi and the Pathogenesis of Typhoid Fever. Annu. Rev. Microbiol. 68, 317–336 (2014).
43. Galán, J. E. Typhoid toxin provides a window into typhoid fever and the biology of Salmonella Typhi. Proc. Natl. Acad. Sci. U. S. A. 113, 6338–6344 (2016).
44. Raffatellu, M. et al. Capsule-Mediated Immune Evasion: a New Hypothesis Explaining Aspects of Typhoid Fever Pathogenesis. Infect. Immun. 74, 19–27 (2006).
45. Kindt, T. J., Goldsby, R. A. & Osborne, B. A. Innate Immunity. in Kuby Immunology 52–71 (W. H. Freeman and Company, 2007).
46. de Jong, H. K., van der Poll, T. & Wiersinga, W. J. The Systemic Pro-Inflammatory Response in Sepsis. J. Innate Immun. 2, 422–430 (2010).
47. Butler, T., Ho, M., Acharya, G., Tiwari, M. & Gallati, H. Interleukin-6, gamma interferon, and tumor necrosis factor receptors in typhoid fever
67
related to outcome of antimicrobial therapy. Antimicrob. Agents Chemother. 37, 2418–2421 (1993).
48. Wilson, R. P. et al. The Vi-capsule prevents Toll-like receptor 4 recognition of Salmonella. Cell. Microbiol. 10, 876–890 (2008).
49. Song, J., Gao, X. & Galán, J. E. Structure and function of the Salmonella Typhi chimaeric A2B5 typhoid toxin. Nature 499, 350–354 (2013).
50. Jones, B., Pascopella, L. & Falkow, S. Entry of microbes into the host: using M cells to break the mucosal barrier. Curr. Opin. Immunol. 7, 474–478 (1995).
51. de Jong, H. K., Parry, C. M., van der Poll, T. & Wiersinga, W. J. Host–Pathogen Interaction in Invasive Salmonellosis. PLoS Pathog. 8, e1002933 (2012).
52. Wain, J. et al. Quantitation of Bacteria in Blood of Typhoid Fever Patients and Relationship between Counts and Clinical Features, Transmissibility, and Antibiotic Resistance. J. Clin. Microbiol. 36, 1683–1687 (1998).
53. Klotz, S. A., Jorgensen, J. H., Buckwold, F. J. & Craven, P. C. Typhoid Fever: An Epidemic With Remarkably Few Clinical Signs and Symptoms. Arch. Intern. Med. 144, 533–537 (1984).
54. Azmatullah, A., Qamar, F. N., Thaver, D., Zaidi, A. K. & Bhutta, Z. A. Systematic review of the global epidemiology, clinical and laboratory profile of enteric fever. J. Glob. Health 5, (2015).
55. Bhutta, Z. A. Current concepts in the diagnosis and treatment of typhoid fever. BMJ 333, 78 (2006).
56. Bergh, E. T. A. M., Hussein Gasem, M., Keuter, M. & Dolmans, M. V. Outcome in three groups of patients with typhoid fever in Indonesia between 1948 and 1990. Trop. Med. Int. Health 4, 211–215 (1999).
57. Stuart, B. M. & Pullen, R. L. Typhoid; clinical analysis of 360 cases. Arch. Intern. Med. Chic. Ill 1908 78, 629–661 (1946).
58. Huckstep, R. Typhoid fever and other salmonella infections. (E.&S. Livingstone, 1962).
59. Gotuzzo, E. et al. Association Between the Acquired Immunodeficiency Syndrome and Infection With Salmonella typhi or Salmonella paratyphi in an Endemic Typhoid Area. Arch. Intern. Med. 151, 381–382 (1991).
60. Vinh, H. et al. Two or three days of ofloxacin treatment for uncomplicated multidrug-resistant typhoid fever in children. Antimicrob. Agents Chemother. 40, 958–961 (1996).
61. Bhutta, Z. A. Impact of age and drug resistance on mortality in typhoid fever. Arch. Dis. Child. 75, 214–217 (1996).
62. Crump, J. A. & Mintz, E. D. Global Trends in Typhoid and Paratyphoid Fever. Clin. Infect. Dis. 50, 241–246 (2010).
63. Olarte, J. & Galindo, E. Salmonella typhi resistant to Chloramphenicol, Ampicillin, and Other Antimicrobial Agents: Strains Isolated During an
68
Extensive Typhoid Fever Epidemic in Mexico. Antimicrob. Agents Chemother. 4, 597–601 (1973).
64. Paniker, C. K. J. & Vimala, K. N. Transferable Chloramphenicol Resistance in Salmonella typhi. Nature 239, 109–110 (1972).
65. Butler, T., Arnold, K., Linh, N. & Pollack, M. Chloramphenicol-resistant typhoid fever In Vietnam associated with R factor. The Lancet 302, 983–985 (1973).
66. Threlfall, E. J. et al. Widespread occurrence of multiple drug-resistant Salmonella typhi in India. Eur. J. Clin. Microbiol. Infect. Dis. 11, 990–993 (1992).
67. Hoa, N. T. et al. Community-acquired septicaemia in southern Viet Nam: the importance of multidrug-resistant Salmonella typhi. Trans. R. Soc. Trop. Med. Hyg. 92, 503–508 (1998).
68. Spread of multiresistant Salmonella typhi. The Lancet 336, 1065–1066 (1990).
69. Kariuki, S., Gilks, C., Revathi, G. & Hart, C. A. Genotypic analysis of multidrug-resistant Salmonella enterica Serovar typhi, Kenya. Emerg. Infect. Dis. 6, 649–651 (2000).
70. Rowe, B., Ward, L. R. & Threlfall, E. J. Multidrug-Resistant Salmonella typhi: A Worldwide Epidemic. Clin. Infect. Dis. 24, S106–S109 (1997).
71. Chinh, N. T. et al. A Randomized Controlled Comparison of Azithromycin and Ofloxacin for Treatment of Multidrug-Resistant or Nalidixic Acid-Resistant Enteric Fever. Antimicrob. Agents Chemother. 44, 1855–1859 (2000).
72. White, N. J., Dung, N. M., Vinh, H., Bethell, D. & Hien, T. T. Fluoroquinolone antibiotics in children with multidrug resistant typhoid. The Lancet 348, 547 (1996).
73. Cao, X. T. et al. A comparative study of ofloxacin and cefixime for treatment of typhoid fever in children. The Dong Nai Pediatric Center Typhoid Study Group. Pediatr. Infect. Dis. J. 18, 245–248
74. Wallace, M. R. et al. Ciprofloxacin versus ceftriaxone in the treatment of multiresistant typhoid fever. Eur. J. Clin. Microbiol. Infect. Dis. 12, 907–910 (1993).
75. Burkhardt, J. E., Hill, M. A., Carlton, W. W. & Kesterson, J. W. Histologic and Histochemical Changes in Articular Cartilages of Immature Beagle Dogs Dosed with Difloxacin, a Fluoroquinolone. Vet. Pathol. 27, 162–170 (1990).
76. Doherty, C. P., Saha, S. K. & Cutting, W. A. Typhoid fever, ciprofloxacin and growth in young children. Ann. Trop. Paediatr. 20, 297–303 (2000).
77. Wain, J. et al. Quinolone-resistant Salmonella typhi in Viet Nam: molecular basis of resistance and clinical response to treatment. Clin. Infect. Dis. 25, 1404–1410 (1997).
78. Brown, J. C., Shanahan, P. M., Jesudason, M. V., Thomson, C. J. & Amyes, S. G. Mutations responsible for reduced susceptibility to 4-quinolones in
69
clinical isolates of multi-resistant Salmonella typhi in India. J. Antimicrob. Chemother. 37, 891–900 (1996).
79. Threlfall, E., Ward, L., Skinner, J., Smith, H. & Lacey, S. Ciprofloxacin-resistant Salmonella typhi and treatment failure. The Lancet 353, 1590–1591 (1999).
80. Murdoch, D. A. et al. Epidemic ciprofloxacin-resistant Salmonella typhi in Tajikistan. The Lancet 351, 339 (1998).
81. Effa, E. E. et al. Fluoroquinolones for treating typhoid and paratyphoid fever (enteric fever). Cochrane Database Syst. Rev. CD004530 (2011). doi:10.1002/14651858.CD004530.pub4
82. Ryan, E. T. Troubling news from Asia about treating enteric fever: a coming storm. Lancet Infect. Dis. 16, 508–509 (2016).
83. Thompson, C. N. et al. Treatment Response in Enteric Fever in an Era of Increasing Antimicrobial Resistance: An Individual Patient Data Analysis of 2092 Participants Enrolled into 4 Randomized, Controlled Trials in Nepal. Clin. Infect. Dis. 64, 1522–1531 (2017).
84. Pandit, A. et al. An Open Randomized Comparison of Gatifloxacin versus Cefixime for the Treatment of Uncomplicated Enteric Fever. PLoS ONE 2, (2007).
85. Arjyal, A. et al. Gatifloxacin versus ceftriaxone for uncomplicated enteric fever in Nepal: an open-label, two-centre, randomised controlled trial. Lancet Infect. Dis. 16, 535–545 (2016).
86. Clasen, T., Schmidt, W.-P., Rabie, T., Roberts, I. & Cairncross, S. Interventions to improve water quality for preventing diarrhoea: systematic review and meta-analysis. BMJ 334, 782 (2007).
87. Waddington, C. S., Darton, T. C. & Pollard, A. J. The challenge of enteric fever. J. Infect. 68, Supplement 1, S38–S50 (2014).
88. Typhoid vaccines: WHO position paper. Wkly. Epidemiol. Rec. 83, 49–59 (2008).
89. Tacket, C. O. et al. Safety and immunogenicity of two Salmonella typhi Vi capsular polysaccharide vaccines. J. Infect. Dis. 154, 342–345 (1986).
90. Black, R. E. et al. Efficacy of one or two doses of Ty21a Salmonella typhi vaccine in enteric-coated capsules in a controlled field trial. Chilean Typhoid Committee. Vaccine 8, 81–84 (1990).
91. Kossaczka, Z. et al. Safety and Immunogenicity of Vi Conjugate Vaccines for Typhoid Fever in Adults, Teenagers, and 2- to 4-Year-Old Children in Vietnam. Infect. Immun. 67, 5806–5810 (1999).
92. Martin, L. B. et al. Status of paratyphoid fever vaccine research and development. Vaccine 34, 2900–2902 (2016).
93. Urdea, M. et al. Requirements for high impact diagnostics in the developing world. Nature 444, 73–79 (2006).
70
94. Andrews, J. R. & Ryan, E. T. Diagnostics for invasive Salmonella infections: Current challenges and future directions. Vaccine 33, C8–C15 (2015).
95. Mabey, D., Peeling, R. W., Ustianowski, A. & Perkins, M. D. Diagnostics for the developing world. Nat. Rev. Microbiol. 2, 231–240 (2004).
96. Girosi, F. et al. Developing and interpreting models to improve diagnostics in developing countries. Nature 444, 3–8 (2006).
97. Peeling, R. W. & Mabey, D. Point-of-care tests for diagnosing infections in the developing world. Clin. Microbiol. Infect. 16, 1062–1069 (2010).
98. Chan, C. P. Y. et al. Evidence-based point-of-care diagnostics: current status and emerging technologies. Annu. Rev. Anal. Chem. Palo Alto Calif 6, 191–211 (2013).
99. Cohen, J. et al. Feasibility of Distributing Rapid Diagnostic Tests for Malaria in the Retail Sector: Evidence from an Implementation Study in Uganda. PLoS ONE 7, (2012).
100. MacFadden, D. R., Bogoch, I. I. & Andrews, J. R. Advances in diagnosis, treatment, and prevention of invasive Salmonella infections: Curr. Opin. Infect. Dis. 29, 453–458 (2016).
101. Thielman, N. M. & Guerrant, R. L. Enteric Fever and Other Causes of Abdominal Symptoms with Fever. in Mandell, Douglas, and Bennett´s Principles and Practice of Infectious Diseases 1215–1285 (Elsevier/Churchill Livingstone, 2005).
102. Gilman, R. H., Terminel, M., Levine, M. M., Hernandez-Mendoza, P. & Hornick, R. B. Relative efficacy of blood, urine, rectal swab, bone-marrow, and rose-spot cultures for recovery of Salmonella typhi in typhoid fever. The Lancet 1, 1211–1213 (1975).
103. Peacock, S. J. & Newton, P. N. Public health impact of establishing the cause of bacterial infections in rural Asia. Trans. R. Soc. Trop. Med. Hyg. 102, 5–6 (2008).
104. Olopoenia, L. & King, A. Widal agglutination test − 100 years later: still plagued by controversy. Postgrad. Med. J. 76, 80–84 (2000).
105. Ley, B. et al. Evaluation of the Widal tube agglutination test for the diagnosis of typhoid fever among children admitted to a rural hdospital in Tanzania and a comparison with previous studies. BMC Infect. Dis. 10, 180 (2010).
106. Wijedoru, L., Mallett, S. & Parry, C. M. Rapid diagnostic tests for typhoid and paratyphoid (enteric) fever. Cochrane Database Syst. Rev. (2017). doi:10.1002/14651858.CD008892.pub2
107. Chaudhry, R. et al. Rapid diagnosis of typhoid fever by an in-house flagellin PCR. J. Med. Microbiol. 59, 1391–1393 (2010).
108. Nandagopal, B. et al. Prevalence of Salmonella typhi among patients with febrile illness in rural and peri-urban populations of Vellore district, as determined by nested PCR targeting the flagellin gene. Mol. Diagn. Ther. 14, 107–112 (2010).
71
109. Bokkenheuser, V. Detection of Typhoid Carriers. Am. J. Public Health Nations Health 54, 477–486 (1964).
110. Gupta, A. et al. Evaluation of community-based serologic screening for identification of chronic Salmonella Typhi carriers in Vietnam. Int. J. Infect. Dis. 10, 309–314 (2006).
111. Fiehn, O. et al. Metabolite profiling for plant functional genomics. Nat. Biotechnol. 18, 1157–1161 (2000).
112. Nicholson, J. K., Lindon, J. C. & Holmes, E. Metabonomics: understanding the metabolic responses of living systems to pathophysiological stimuli via multivariate statistical analysis of biological NMR spectroscopic data. Xenobiotica 29, 1181–1189 (1999).
113. Dalgliesh, C. E., Horning, E. C., Horning, M. G., Knox, K. L. & Yarger, K. A gas–liquid-chromatographic procedure for separating a wide range of metabolites occuring in urine or tissue extracts. Biochem. J. 101, 792–810 (1966).
114. Fiehn, O. Metabolomics–the link between genotypes and phenotypes. Plant Mol. Biol. 48, 155–171 (2002).
115. B. Dunn, W., I. Broadhurst, D., J. Atherton, H., Goodacre, R. & L. Griffin, J. Systems level studies of mammalian metabolomes: the roles of mass spectrometry and nuclear magnetic resonance spectroscopy. Chem. Soc. Rev. 40, 387–426 (2011).
116. Hyötyläinen, T. Novel methodologies in metabolic profiling with a focus on molecular diagnostic applications. Expert Rev. Mol. Diagn. 12, 527–538 (2012).
117. Theodoridis, G., Gika, H. G. & Wilson, I. D. Mass spectrometry-based holistic analytical approaches for metabolite profiling in systems biology studies. Mass Spectrom. Rev. 30, 884–906 (2011).
118. Nicholson, J. K. & Lindon, J. C. Systems biology: metabonomics. Nature 455, 1054–1056 (2008).
119. Kell, D. B. & Oliver, S. G. The metabolome 18 years on: a concept comes of age. Metabolomics 12, (2016).
120. Riekeberg, E. & Powers, R. New frontiers in metabolomics: from measurement to insight. F1000Research 6, (2017).
121. Biomarkers and surrogate endpoints: Preferred definitions and conceptual framework. Clin. Pharmacol. Ther. 69, 89–95 (2001).
122. Nordström, A. & Lewensohn, R. Metabolomics: Moving to the Clinic. J. Neuroimmune Pharmacol. 5, 4–17 (2009).
123. Koen, N., Du Preez, I. & Loots, D. T. Metabolomics and Personalized Medicine. Adv. Protein Chem. Struct. Biol. 102, 53–78 (2016).
124. Xia, J., Broadhurst, D. I., Wilson, M. & Wishart, D. S. Translational biomarker discovery in clinical metabolomics: an introductory tutorial. Metabolomics 9, 280–299 (2012).
72
125. Mamas, M., Dunn, W. B., Neyses, L. & Goodacre, R. The role of metabolites and metabolomics in clinically applicable biomarkers of disease. Arch. Toxicol. 85, 5–17 (2010).
126. Serkova, N. J., Standiford, T. J. & Stringer, K. A. The Emerging Field of Quantitative Blood Metabolomics for Biomarker Discovery in Critical Illnesses. Am. J. Respir. Crit. Care Med. 184, 647–655 (2011).
127. Voge, N. V. et al. Metabolomics-Based Discovery of Small Molecule Biomarkers in Serum Associated with Dengue Virus Infections and Disease Outcomes. PLoS Negl. Trop. Dis. 10, (2016).
128. Cui, L. et al. Serum Metabolome and Lipidome Changes in Adult Patients with Primary Dengue Infection. PLoS Negl Trop Dis 7, e2373 (2013).
129. Surowiec, I. et al. Metabolic Signature Profiling as a Diagnostic and Prognostic Tool in Pediatric Plasmodium falciparum Malaria. Open Forum Infect. Dis. 2, (2015).
130. Sengupta, A. et al. Global host metabolic response to Plasmodium vivax infection: a 1H NMR based urinary metabonomic study. Malar. J. 10, 384 (2011).
131. Tritten, L. et al. Metabolic Profiling Framework for Discovery of Candidate Diagnostic Markers of Malaria. Sci. Rep. 3, (2013).
132. Mickiewicz, B., Vogel, H. J., Wong, H. R. & Winston, B. W. Metabolomics as a Novel Approach for Early Diagnosis of Pediatric Septic Shock and its Mortality. Am. J. Respir. Crit. Care Med. (2013). doi:10.1164/rccm.201209-1726OC
133. Langley, R. J. et al. An Integrated Clinico-Metabolomic Model Improves Prediction of Death in Sepsis. Sci. Transl. Med. 5, 195ra95-195ra95 (2013).
134. Kamisoglu, K. et al. Temporal Metabolic Profiling of Plasma During Endotoxemia in Humans: Shock 40, 519–526 (2013).
135. Kamisoglu, K. et al. Human metabolic response to systemic inflammation: assessment of the concordance between experimental endotoxemia and clinical cases of sepsis/SIRS. Crit. Care 19, 71 (2015).
136. Kim, Y.-M. et al. Salmonella modulates metabolism during growth under conditions that induce expression of virulence genes. Mol. Biosyst. (2013). doi:10.1039/c3mb25598k
137. Antunes, L. C. M. et al. Metabolomics Reveals Phospholipids as Important Nutrient Sources during Salmonella Growth in Bile In Vitro and In Vivo. J. Bacteriol. 193, 4719–4725 (2011).
138. Antunes, L. C. M. et al. Impact of Salmonella Infection on Host Hormone Metabolism Revealed by Metabolomics. Infect. Immun. 79, 1759–1769 (2011).
139. Thompson, L. J. et al. Transcriptional response in the peripheral blood of patients infected with Salmonella enterica serovar Typhi. Proc. Natl. Acad. Sci. U. S. A. 106, 22433–22438 (2009).
73
140. Kumar, A., Singh, S., Ahirwar, S. K. & Nath, G. Proteomics-based identification of plasma proteins and their association with the host–pathogen interaction in chronic typhoid carriers. Int. J. Infect. Dis. 19, 59–66 (2014).
141. Pocock, G., Richards, C. D. & Richards, D. A. Human Physiology. (Oxford University Press, 2013).
142. Teahan, O. et al. Impact of Analytical Bias in Metabonomic Studies of Human Blood Serum and Plasma. Anal. Chem. 78, 4307–4318 (2006).
143. Wedge, D. C. et al. Is Serum or Plasma More Appropriate for Intersubject Comparisons in Metabolomic Studies? An Assessment in Patients with Small-Cell Lung Cancer. Anal. Chem. 83, 6689–6697 (2011).
144. Liu, L. et al. Differences in metabolite profile between blood plasma and serum. Anal. Biochem. 406, 105–112 (2010).
145. Bouatra, S. et al. The Human Urine Metabolome. PLoS ONE 8, e73076 (2013).
146. Psychogios, N. et al. The Human Serum Metabolome. PLoS One 6, e16957 (2011).
147. Zhang, A., Sun, H., Wu, X. & Wang, X. Urine metabolomics. Clin. Chim. Acta 414, 65–69 (2012).
148. Want, E. J. et al. Global metabolic profiling procedures for urine using UPLC–MS. Nat. Protoc. 5, 1005–1018 (2010).
149. A, J. et al. Extraction and GC/MS Analysis of the Human Blood Plasma Metabolome. Anal. Chem. 77, 8086–8094 (2005).
150. Zaikin, V. & Halket, J. M. A Handbook of Derivatives for Mass Spectrometry. (IM Publications, 2009).
151. Halket, J. M. et al. Chemical derivatization and mass spectral libraries in metabolic profiling by GC/MS and LC/MS/MS. J. Exp. Bot. 56, 219–243 (2005).
152. Koek, M. M., Muilwijk, B., van der Werf, M. J. & Hankemeier, T. Microbial Metabolomics with Gas Chromatography/Mass Spectrometry. Anal. Chem. 78, 1272–1281 (2006).
153. Sparkman, O. D., Penton, Z. E. & Kitson, F. G. Gas Chromatography and Mass Spectrometry. A practical guide. (Elsevier, 2011).
154. Mondello, L., Tranchida, P. Q., Dugo, P. & Dugo, G. Comprehensive two-dimensional gas chromatography-mass spectrometry: A review. Mass Spectrom. Rev. 27, 101–124 (2008).
155. Almstetter, M. F., Oefner, P. J. & Dettmer, K. Comprehensive two-dimensional gas chromatography in metabolomics. Anal. Bioanal. Chem. 402, 1993–2013 (2012).
156. Kouremenos, K. A., Pitt, J. & Marriott, P. J. Metabolic profiling of infant urine using comprehensive two-dimensional gas chromatography: Application to the diagnosis of organic acidurias and biomarker discovery. J. Chromatogr. A 1217, 104–111 (2010).
157. Bertsch, W. Two-Dimensional Gas Chromatography. Concepts, Instrumentation, and Applications–Part 2: Comprehensive Two-Dimensional Gas Chromatography. J. High Resolut. Chromatogr. 23, 167–181 (2000).
158. Górecki, T., Harynuk, J. & Panić, O. The evolution of comprehensive two-dimensional gas chromatography (GC×GC). J. Sep. Sci. 27, 359–379 (2004).
159. Harris, D. C. Quantitative Chemical Analysis. (W. H. Freeman and Company, 2007).
160. de Hoffmann, E. & Stroobant, V. Mass Spectrometry. Principles and Applications. (John Wiley & Sons Ltd, 2007).
161. Fiehn, O. Extending the breadth of metabolite profiling by gas chromatography coupled to mass spectrometry. Trends Anal. Chem. TRAC 27, 261–269 (2008).
162. Pasikanti, K. K., Ho, P. C. & Chan, E. C. Y. Development and validation of a gas chromatography/mass spectrometry metabonomic platform for the global profiling of urinary metabolites. Rapid Commun. Mass Spectrom. 22, 2984–2992 (2008).
163. Li, X. et al. Comprehensive two-dimensional gas chromatography/time-of-flight mass spectrometry for metabonomics: Biomarker discovery for diabetes mellitus. Anal. Chim. Acta 633, 257–262 (2009).
164. Welthagen, W. et al. Comprehensive two-dimensional gas chromatography–time-of-flight mass spectrometry (GC × GC-TOF) for high resolution metabolomics: biomarker discovery on spleen tissue extracts of obese NZO compared to lean C57BL/6 mice. Metabolomics 1, 65–73 (2005).
165. Dunn, W. B. et al. Closed-Loop, Multiobjective Optimization of Two-Dimensional Gas Chromatography/Mass Spectrometry for Serum Metabolomics. Anal. Chem. 79, 464–476 (2007).
166. Almstetter, M. F. et al. Integrative Normalization and Comparative Analysis for Metabolic Fingerprinting by Comprehensive Two-Dimensional Gas Chromatography–Time-of-Flight Mass Spectrometry. Anal. Chem. 81, 5731–5739 (2009).
167. Castillo, S., Mattila, I., Miettinen, J., Oresic, M. & Hyötyläinen, T. Data Analysis Tool for Comprehensive Two-Dimensional Gas Chromatography/Time-of-Flight Mass Spectrometry. Anal. Chem. 83, 3058–3067 (2011).
168. Gika, H. G., Theodoridis, G. A., Plumb, R. S. & Wilson, I. D. Current practice of liquid chromatography–mass spectrometry in metabolomics and metabonomics. J. Pharm. Biomed. Anal. 87, 12–25 (2014).
169. Dunn, W. B. et al. Procedures for large-scale metabolic profiling of serum and plasma using gas chromatography and liquid chromatography coupled to mass spectrometry. Nat. Protoc. 6, 1060–1083 (2011).
170. Wilson, I. D. et al. HPLC-MS-based methods for the study of metabonomics. J. Chromatogr. B 817, 67–76 (2005).
171. Jonsson, P. et al. High-Throughput Data Analysis for Detecting and Identifying Differences between Samples in GC/MS-Based Metabolomic Analyses. Anal. Chem. 77, 5635–5642 (2005).
172. Jonsson, P. Multivariate processing and modelling of hyphenated metabolite data. (2005).
173. Sumner, L. W. et al. Proposed minimum reporting standards for chemical analysis. Metabolomics 3, 211–221 (2007).
174. Werner, E. et al. Mass spectrometry for the identification of the discriminating signals from metabolomics: Current status and future trends. J. Chromatogr. B 871, 143–163 (2008).
175. Dettmer, K., Aronov, P. A. & Hammock, B. D. Mass spectrometry-based metabolomics. Mass Spectrom. Rev. 26, 51–78 (2007).
176. Trygg, J., Holmes, E. & Lundstedt, T. Chemometrics in Metabonomics. J. Proteome Res. 6, 469–479 (2007).
177. Eriksson, L. et al. Multi- and Megavariate Data Analysis. Part I. Basic Principles and Applications. (Umetrics AB, 2006).
178. Wold, S., Esbensen, K. & Geladi, P. Principal component analysis. Chemom. Intell. Lab. Syst. 2, 37–52 (1987).
179. Wold, H. Nonlinear estimation by iterative least square procedures. (Wiley, 1966).
180. Wold, S., Sjöström, M. & Eriksson, L. PLS-regression: a basic tool of chemometrics. Chemom. Intell. Lab. Syst. 58, 109–130 (2001).
181. Trygg, J. & Wold, S. Orthogonal projections to latent structures (O-PLS). J. Chemom. 16, 119–128 (2002).
182. Bylesjö, M. et al. OPLS discriminant analysis: combining the strengths of PLS-DA and SIMCA classification. J. Chemom. 20, 341–351 (2006).
183. Wold, S. Cross-Validatory Estimation of the Number of Components in Factor and Principal Components Models. Technometrics 20, 397 (1978).
184. Westerhuis, J. A. et al. Assessment of PLSDA cross validation. Metabolomics 4, 81–89 (2008).
185. Eriksson, L., Trygg, J. & Wold, S. CV-ANOVA for significance testing of PLS and OPLS® models. J. Chemom. 22, 594–600 (2008).
186. Zweig, M. H. & Campbell, G. Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine. Clin. Chem. 39, 561–577 (1993).
187. Bamber, D. The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. J. Math. Psychol. 12, 387–415 (1975).
188. Gromski, P. S. et al. A comparative investigation of modern feature selection and classification approaches for the analysis of mass spectrometry data. Anal. Chim. Acta 829, 1–8 (2014).
189. Galindo-Prieto, B., Eriksson, L. & Trygg, J. Variable influence on projection (VIP) for orthogonal projections to latent structures (OPLS). J. Chemom. 28, 623–632 (2014).
190. Andersen, C. M. & Bro, R. Variable selection in regression—a tutorial. J. Chemom. 24, 728–737 (2010).
191. Rajalahti, T. et al. Discriminating Variable Test and Selectivity Ratio Plot: Quantitative Tools for Interpretation and Variable (Biomarker) Selection in Complex Spectral or Chromatographic Profiles. Anal. Chem. 81, 2581–2590 (2009).
192. Saccenti, E., Hoefsloot, H. C. J., Smilde, A. K., Westerhuis, J. A. & Hendriks, M. M. W. B. Reflections on univariate and multivariate analysis of metabolomics data. Metabolomics 10, 361–374 (2014).
193. Thompson, C. N. et al. Undifferentiated Febrile Illness in Kathmandu, Nepal. Am. J. Trop. Med. Hyg. 92, 875–878 (2015).
194. Roumagnac, P. et al. Evolutionary history of Salmonella typhi. Science 314, 1301–1304 (2006).
195. Gonzalez-Escobedo, G., Marshall, J. M. & Gunn, J. S. Chronic and acute infection of the gall bladder by Salmonella Typhi: understanding the carrier state. Nat. Rev. Microbiol. 9, 9–14 (2010).
196. Gopinath, S., Carden, S. & Monack, D. Shedding light on Salmonella carriers. Trends Microbiol. 20, 320–327 (2012).
197. Jansen, A. M. et al. A Salmonella Typhimurium-Typhi Genomic Chimera: A Model to Study Vi Polysaccharide Capsule Function In Vivo. PLoS Pathog. 7, e1002131 (2011).
198. Beisel, W. R. Metabolic Response to Infection. Annu. Rev. Med. 26, 9–20 (1975).