CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography...
![Page 1: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/1.jpg)
CamGrid: High Throughput Computing in Science
Dr David Burke
Antigenic Cartography Group
Department of Zoology
University of Cambridge
25th June 2008
Modelling the evolution of the influenza virus
![Page 2: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/2.jpg)
Antigenic variation of viruses
Antigenically Stable Pathogens: Smallpox, Measles, Tuberculosis, Mumps, Tetanus
Antigenically Variable Pathogens: Influenza Virus, Malaria, HIV, Dengue
![Page 3: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/3.jpg)
The Influenza Virus
Annually, 'flu infects 7-14% of the population (400-800 million people globally).
The virus genome contains 8 RNA segments, which code for 11 proteins.
The RNA polymerase makes a single nucleotide error roughly every 10 thousand nucleotides.
Nearly every new influenza virus has multiple mutations.
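The error rate quoted above can be turned into a rough per-genome figure. The sketch below is illustrative only: the genome length of about 13,500 nucleotides is an assumption (an approximate figure for the eight segments combined, not stated on the slide), and errors are assumed to occur independently at each position.

```python
# Rough estimate of copying errors per genome replication.
ERROR_RATE = 1 / 10_000   # from the slide: one error per ~10 thousand nucleotides
GENOME_LENGTH = 13_500    # ASSUMPTION: approximate total length of the 8 RNA segments


def expected_mutations(error_rate: float, genome_length: int) -> float:
    """Mean number of errors introduced when the genome is copied once."""
    return error_rate * genome_length


def prob_at_least_one(error_rate: float, genome_length: int) -> float:
    """Probability that a single genome copy carries one or more errors."""
    return 1.0 - (1.0 - error_rate) ** genome_length


mean = expected_mutations(ERROR_RATE, GENOME_LENGTH)
p_any = prob_at_least_one(ERROR_RATE, GENOME_LENGTH)
print(f"expected errors per genome copy: {mean:.2f}")
print(f"probability of at least one error: {p_any:.2f}")
```

Under these assumptions each genome copy picks up on the order of one new error, so changes accumulate quickly over successive rounds of replication.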
![Page 4: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/4.jpg)
Haemagglutinin
Haemagglutinin (HA) is found on the surface of influenza viruses.
There are ~500 HA copies per virus.
It is responsible for binding the virus to the cell being infected, via sugars (sialic acid) on the cell surface.
There are at least 16 different HA antigens.
These subtypes are labelled H1 through H16.
Only the first three haemagglutinins, H1, H2, and H3, are found in human influenza viruses.
![Page 5: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/5.jpg)
Haemagglutinin-Antibody Complex
HA is the major target of an individual's antigenic response.
Over time, mutations build up and antibodies lose the ability to bind.
For this reason, the 'flu vaccine has had to be updated more than 20 times over the last 40 years.
![Page 6: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/6.jpg)
Neuraminidase
Nine subtypes of influenza neuraminidase are known.
Subtypes N1 and N2 have been linked to epidemics in man.
Neuraminidase is the target for several drugs (Tamiflu, Relenza).
Neuraminidase cleaves terminal sialic acid residues from carbohydrate moieties on the surfaces of infected cells. This promotes the release of viruses from the cells.
Influenza strains are classified according to their HA/NA subtypes, e.g. H3N2.
There are 100 molecules of neuraminidase per virion.
![Page 7: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/7.jpg)
Influenza virus: pandemic and epidemic
Spanish flu, 1918, H1N1: 40 million deaths
Asian flu, 1957, H2N2: 1-4 million deaths
Hong Kong flu, 1968, H3N2: 1 million deaths
2008, H5N1?: 5-150 million??
![Page 8: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/8.jpg)
Features of the “antigenic map” of Influenza H3N2, 1968-2003
Two dimensional
Mostly linear
Forms clusters
Chronologically ordered
Approximately equal time between clusters
Approximately equal distance between clusters
These maps are now routinely used for selection of strains for the 'flu vaccine.
[Figure: antigenic map with clusters labelled 1968, 1972, 1975, 1977, 1979, 1987, 1989, 1992, 1995, 1997 and 2002]
![Page 9: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/9.jpg)
Questions
Why are there distinct clusters and not slow progression?
What is the mechanism of large antigenic changes?
Why does 'flu not evolve faster?
[Figure: antigenic map with clusters labelled 1968, 1972, 1975, 1977, 1979, 1987, 1989, 1992, 1995, 1997 and 2002]
![Page 10: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/10.jpg)
Determinants of ligand specificity for HA and NA
Human & pig adapted influenza viruses: 5-N-acetylneuraminic acid-alpha-2,6-galactose
Avian, equine & pig adapted influenza viruses: 5-N-acetylneuraminic acid-alpha-2,3-galactose
Neuraminidase inhibitor: oseltamivir (Tamiflu)
![Page 11: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/11.jpg)
Sialic acid binding to haemagglutinin
How will strain variation change the affinity and specificity of sialic acid binding?
gal-α-2-3-sia gal-α-2-6-sia
![Page 12: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/12.jpg)
Oseltamivir (tamiflu) bound to neuraminidase
How will strain variation (amino acid changes) affect the specificity for sialic acid and the binding of inhibitors?
![Page 13: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/13.jpg)
In silico predictions of the structure of the virus
Structure prediction
Comparative modelling, based on the X-ray structure of a strain of HA from 1968
Molecular dynamics
Monte Carlo simulations
Which features of the protein structure change as the virus evolves?
Can we quantify the antigenic change, given the amino acid substitutions and subsequent structure prediction?
![Page 14: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/14.jpg)
Use of CamGrid resource
Multiple strains: >300 HA strains, >100 NA strains
Multiple simulation conditions
Both MD and MC methods are computationally expensive
Each simulation takes >5 days on a single CPU
Total simulations to date: 222,000 CPU hours = 25.3 CPU years
This is only made possible by CamGrid
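The conversion from CPU hours to CPU years can be checked with a few lines of arithmetic. This is a minimal sketch assuming a 365-day year; the simulation count is only an upper bound implied by the ">5 days" figure and is not a number given on the slide.

```python
# Check the CPU-hour to CPU-year conversion quoted on the slide.
HOURS_PER_YEAR = 24 * 365          # ASSUMPTION: 365-day year


def cpu_hours_to_years(cpu_hours: float) -> float:
    """Convert accumulated CPU hours into CPU years."""
    return cpu_hours / HOURS_PER_YEAR


total_cpu_hours = 222_000          # from the slide
min_hours_per_simulation = 5 * 24  # each simulation takes more than 5 days

cpu_years = cpu_hours_to_years(total_cpu_hours)
# Each run needs at least 120 CPU hours, so this is an upper bound on the run count.
max_simulations = total_cpu_hours // min_hours_per_simulation

print(f"{cpu_years:.1f} CPU years")
print(f"at most {max_simulations} simulations")
```

Run serially on one CPU, the same workload would take roughly a quarter of a century, which is why spreading independent simulations across many machines matters here.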
![Page 15: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/15.jpg)
Comparison of trimer structures
HK68 (1968) and EN72 (1972)
![Page 16: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/16.jpg)
[Figure: cluster labels 1968, 1972, 1975, 1977, 1979, 1987, 1989, 1992, 1995, 1997 and 2002]
![Page 17: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/17.jpg)
General-purpose computing on graphics processing units (GPGPU)
GPGPU is the technique of using a GPU, which typically handles computation only for computer graphics, to perform computation in applications traditionally handled by the CPU.
A GPU contains hundreds of individual processors.
GPGPU is made possible by the addition of code which allows software developers to use the graphics card for non-graphics data.
Usually this requires a high level of programming expertise.
NVIDIA CUDA™ is a C language environment for application development on the GPU.
It contains standard numerical libraries for FFT (Fast Fourier Transform) and BLAS (Basic Linear Algebra Subroutines).
It supports Linux 32/64-bit and Windows XP 32/64-bit operating systems.
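The programming model behind GPGPU is that one small function (a kernel) is run once per data element, with many elements handled at the same time by the GPU's processors. The sketch below is a CPU-only emulation of that pattern in plain Python, not CUDA code; `saxpy_kernel` and `launch` are hypothetical names chosen for illustration, and the operation shown (a*x + y, the BLAS "saxpy" routine) is just a convenient example.

```python
# CPU-only emulation of the "one thread per element" kernel pattern.
# ASSUMPTION: saxpy_kernel and launch are illustrative names, not a real GPU API.

def saxpy_kernel(i, a, x, y, out):
    """Work done by one 'thread': compute a single output element."""
    out[i] = a * x[i] + y[i]


def launch(kernel, n, *args):
    """Emulate a kernel launch by calling the kernel once per index.

    On a GPU these calls would run concurrently across many processors;
    here they simply run one after another in a loop.
    """
    for i in range(n):
        kernel(i, *args)


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
result = [0.0] * len(x)

launch(saxpy_kernel, len(x), 2.0, x, y, result)
print(result)  # [12.0, 24.0, 36.0, 48.0]
```

Because each call touches only its own index, the calls are independent of one another, which is the property that lets a GPU spread them over hundreds of processors.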
![Page 18: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/18.jpg)
Accelerating Molecular Modelling Applications with Graphics Processors
Folding@Home uses molecular dynamics to fold proteins in silico. Since 2006, its code has used GPUs from the ATI X1900 class of graphics cards, as well as the new Cell processor in Sony's PlayStation 3.
John Stone and colleagues (J Comput Chem 28: 2618–2640) rewrote NAMD to use CUDA on an NVIDIA 8800GTX card (128 processor cores).
They produced a 5X increase in speed, reaching 269 GFLOPS performance.
The 2.6-GHz Intel quad core CPU reached 5.3 GFLOPS.
![Page 19: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/19.jpg)
The future of CamGrid?
Nvidia have released Tesla, a version specifically for GPGPU, which has no graphics output
Tesla cards have up to 240 cores per processor
Tesla C1060 has 1 GPU achieving ~1000 GFLOPS of processing power
Tesla S1060 (1U rack) has 4 GPUs reaching ~4000 GFLOPS
An 8 GPU version of the Tesla S870 is planned for the future
![Page 20: CamGrid: High Throughput Computing in Science dfb21@cam.ac.uk Dr David Burke Antigenic Cartography Group Department of Zoology University of Cambridge.](https://reader034.fdocuments.us/reader034/viewer/2022050909/56649d5e5503460f94a3e136/html5/thumbnails/20.jpg)
Erasmus Medical Centre: Ron Fouchier, Jan de Jong, Bjorn Koel, Vincent Munster, Guus Rimmelzwaan, Walter Beyer, Theo Bestebroer, Ruud van Beek, Ab Osterhaus
Santa Fe Institute: Alan Lapedes
University of Cambridge: Derek Smith, David Burke, Terry Jones, Colin Russell, Nicola Lewis (& AHT), Dan Horton (& VLA), Ana Mosterin, Eugene Skepner, Yan Wong (& Leeds), Margaret Mackinnon (& KEMRI), David Wales (Chemistry), Chris Whittleston (Chemistry), Birgit Strodel (Chemistry), Mike Payne (Cavendish labs), Sebastian Ahnert (Cavendish labs)
CamGrid sys admins
WHO global influenza surveillance
CDC: Nancy Cox, Sasha Klimov, Michael Shaw
MELB: Ian Gust, Ian Barr, Aeron Hurt, Alan Hampson
NIMR: Alan Hay, Y-P Lin, Vicky Gregory
NIID: Tashiro Masato, Takato Odagiri
WHO: Wenqing Zhang, Klaus Stohr
NICs: Critical and enormously valuable
Funding: NIH Director’s Pioneer Award, Fogarty International Center, HFSP, IFPMA, CIDC, EU Framework 5 Novaflu, EU Framework 6 Virgil