Improving Viral Manufacture through Advanced PAT · 2019-11-28 · Improving Viral Manufacture...

Improving Viral Manufacture through Advanced PAT Dr Michael Delahaye Lead Scientist – Industrialisation AMC October 2019


  • Improving Viral Manufacture through Advanced PAT

    Dr Michael Delahaye, Lead Scientist – Industrialisation

    AMC October 2019

  • The Cell and Gene Therapy Catapult

    £70m Development Facility

    • 1,200 m² custom-designed cell and gene therapy development facility

    • Prime location in the heart of the London clinical research cluster

    • >200 permanent staff

    £55m large-scale manufacturing centre

    • 7,200 m² manufacturing centre designed specifically for cell and gene therapies

    • Located in the Stevenage biocatalyst

    • 12 independent manufacturing modules

  • Artificial Intelligence

    Predictive analytics

    Deep learning

    Digital twins

    Machine Learning

    Translation

    Data extraction

    Natural Language processing

    Image recognition

    Machine vision

    Diagnostics

    Autonomous machines

    Planning & optimisation

  • AI application – cell therapy

    In 2016 Google DeepMind Health began collaborating with Moorfields Eye Hospital

    Aim: develop an AI platform that analyses OCT images, diagnoses the early stages of 50 eye diseases and recommends treatment

    Has a 94% accuracy rate (comparable to a clinician with 20 years' experience)

    Also explains to clinicians how it arrives at its recommendations

    In 2018 the FDA approved the marketing of the first medical device that uses artificial intelligence to detect diabetic retinopathy (DR) in adults with diabetes

    Uses a deep learning system to analyse images and identify DR with 89.5% accuracy.

    Microsoft, in collaboration with MIT, has developed two systems that use machine learning to design guide RNA sequences for CRISPR-based gene editing.

    Elevation: predicts off-target effects and provides a score for how likely a guide is to disrupt the cell

    Azimuth: predicts which part of a gene to target for optimal knockout (KO)

  • AI and big data

    • AI works best when large amounts of rich, big data are available. The more facets the data covers, the faster the algorithms can learn and fine-tune their predictive analyses.

    • Big Data is the raw input – it needs to be cleaned, structured and integrated before it becomes useful.

    • Artificial intelligence is the output, the intelligence that results from the processed data.

    • To solve a specific set of problems, AI needs to be trained properly in order to build intelligence and allow systems to be validated

    • IDx – 495,000 images to train and validate the AI system for identifying diabetic retinopathy

  • Data generation – biopharmaceutical production

    [Figure: biopharmaceutical production workflow (upstream, downstream, formulation) feeding process and material insights and actionable data analytics]

    • >500 QC entries
    • >2000 batch record entries
    • >500 million continuous data points

  • Data driven processing

  • Data driven decision making

    [Figure: analytics maturity curve, business value (y-axis) against complexity (x-axis), moving from hindsight through insight to foresight]

    • Descriptive analytics: What happened?
    • Diagnostic analytics: Why did it happen?
    • Predictive analytics: What will happen?
    • Prescriptive analytics: How can we make it happen?

  • Data driven gene therapy bioprocessing

    • Temp
    • pH
    • DO
    • Raman spectroscopy
    • RI spectroscopy
    • Metabolomics
    • Transcriptomics
    • 2 production processes
    • 12 production runs
    • 100s of samples
    • ~10 million data points
    • Multivariate analysis

  • Cell metabolism

    [Figure: metabolic pathway map with a legend distinguishing Process 1 and Process 2. Labelled pathway groups: TCA cycle, cholesterol, purine metabolism, DNA synthesis and membrane formation. Nodes are individual metabolites, including glucose, pyruvate, lactate, acetyl CoA, citrate, α-ketoglutarate, succinate, fumarate and malate; amino acids and their N-acetyl and γ-glutamyl derivatives; choline, phosphocholine and glycerophosphocholine; and polyamines such as putrescine, cadaverine and spermidine.]

  • Supervised learning: model performance and optimisation

    All data (144 samples, 300 metabolites), split into Training (70%) and Test (30%)

    Model performance

    $$\text{Accuracy} = \frac{\text{No. of correct predictions}}{\text{Total no. of predictions}}$$

    • Model parameters estimated using “training” data

    • Model performance measured on “test” data to assess ability to generalise (to guard against overfitting)

    Model optimisation using k-fold cross validation (parameter optimisation)

    • Model parameter optimisation carried out using repeated k-fold cross validation

    • Provides most mileage on training data

    • Accuracy averaged across 3 repeats and 10 folds

    [Figure: training data divided into cross-validation training and cross-validation test folds, ×10]
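
    The split and resampling scheme on this slide can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the code used in the study: the function names, the random seeds and the rounding of the 30% hold-out are assumptions, and no classifier is fitted here.

```python
import random

def train_test_split_indices(n_samples, test_fraction=0.30, seed=0):
    """Shuffle sample indices and hold out a fraction as the test set."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]  # (train, test)

def repeated_kfold(indices, k=10, repeats=3, seed=0):
    """Yield (cv_train, cv_validation) index lists for repeated k-fold CV."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        # Deal the shuffled indices into k folds of near-equal size.
        folds = [shuffled[i::k] for i in range(k)]
        for i, validation in enumerate(folds):
            train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            yield train, validation

def accuracy(y_true, y_pred):
    """Number of correct predictions divided by total number of predictions."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# 144 samples as on the slide; the test indices are never used during tuning.
train_idx, test_idx = train_test_split_indices(144)
splits = list(repeated_kfold(train_idx, k=10, repeats=3))
print(len(train_idx), len(test_idx), len(splits))
```

    In use, each candidate parameter setting would be scored by averaging `accuracy` over all 30 splits, and only the chosen setting would then be evaluated once on the held-out test indices.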

  • Machine learning algorithms

    What to measure

    • Application of machine learning to select markers for process monitoring
    • Combined with metabolomics to incorporate biological function

    Rank  Marker                 Function                 Variable importance (%)
    1     Proline                Amino acid               81
    2     Glucose                Glycolysis               75
    3     Hydroxyphenyl lactate  Lactic acid              71
    4     Asparagine             Nitrogen donor           67
    5     Histidine              Amino acid               66
    6     Spermine               Cholesterol metabolism   64
    7     Malate                 TCA cycle by-product     56
    8     Glutamine              Amino acid               52
    9     Glycine                Formation of serine      47
    10    Lactate                Glycolysis by-product    46
    11    Serine                 Amino acid               44
    12    Pyruvate               Glycolysis               33

  • Data driven decision making

    [Figure: analytics maturity curve, business value (y-axis) against complexity (x-axis), moving from hindsight through insight to foresight]

    • Descriptive analytics: What happened?
    • Diagnostic analytics: Why did it happen?
    • Predictive analytics: What will happen?
    • Prescriptive analytics: How can we make it happen?

  • Process Analytical Technologies (PAT)

    PAT is a framework for

    “designing, analysing, and controlling manufacturing through timely measurements (i.e., during processing) of critical quality and performance attributes of raw and in-process materials and processes, with the goal of ensuring final product quality”

    The aim of PAT is to obtain better process control by

    • identifying and managing sources of variability

    • reducing cost by optimising the use of raw materials

    • minimising product cycle times through the use of measurements that are:

      • in-line (analysed in place)

      • on-line (sample removed, analysed and returned to the process stream)

      • at-line (sample removed and analysed close to the process stream)

    FDA - PAT—a framework for innovative pharmaceutical development, manufacturing and quality assurance. 2004

  • Real time data acquisition

    Raman Spectroscopy

    Raman spectroscopy is a technique used to observe molecular vibrations that can identify and quantify molecules.

    By measuring changes in the wavelength of scattered laser light, it is possible to identify which molecules are present in the cell culture media.

    • Spectra collected at regular intervals.

    • Matched to nearest offline / at-line sampling time points.

    • Pre-processed using a number of algorithms.

    • Modelling of at-line and offline data performed using multivariate chemometric methods such as projection to latent structures (PLS) and artificial neural networks.
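
    The time-matching step above can be illustrated with a short sketch. It assumes spectra and offline samples both carry a timestamp in hours on a common clock; the function name and the example times are hypothetical, and the pre-processing algorithms mentioned on the slide are not reproduced because the slide does not name them.

```python
from bisect import bisect_left

def match_nearest(spectrum_times, sample_times):
    """For each offline sample time, return the index of the nearest spectrum.

    spectrum_times must be sorted in ascending order.
    Ties are resolved in favour of the earlier spectrum.
    """
    matches = []
    for t in sample_times:
        pos = bisect_left(spectrum_times, t)
        # Only the neighbours either side of the insertion point can be nearest.
        candidates = [i for i in (pos - 1, pos) if 0 <= i < len(spectrum_times)]
        matches.append(min(candidates, key=lambda i: abs(spectrum_times[i] - t)))
    return matches

# Hypothetical acquisition times (hours): spectra every 30 min, three offline samples.
spectrum_times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
sample_times = [0.1, 1.3, 2.9]
matched = match_nearest(spectrum_times, sample_times)
print(matched)  # [0, 3, 6]
```

    Each matched pair (spectrum, offline measurement) then becomes one row of the calibration data set used for the chemometric model.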

  • Raman spectroscopy

    [Figure: Raman data compared with offline data for process runs 1 to 4. Four panels (glucose, lactate, glutamine, ammonia) plot concentration against time over days 0 to 12; three concentration axes are in g/L and one is in mg/L.]

  • Raman model development

    Runs 1 to 12: metabolomics data and Raman spectroscopy data

    Random selection of ~20% of the data to allow Raman model development

  • Raman model development

    Runs 1 to 12, split into training data and test data

    PLS regression modelling for each metabolite, repeated ×50

    Model building was performed using 5-fold cross-validation.
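
    To make the PLS step concrete, here is a deliberately simplified sketch: a single-component PLS1 regression written in plain Python and applied to synthetic spectra. It is not the model from the study, which would normally use several latent components; the 5-fold cross-validation and the ×50 repetition named on the slide are omitted, and the peak shape, baseline and concentrations are invented for illustration.

```python
import math
import random

def fit_pls1_single_component(X, y):
    """Fit a one-component PLS1 model (rows of X are spectra, y is a concentration)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: the spectral direction with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores along that direction, then regress y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return {"x_mean": x_mean, "y_mean": y_mean, "w": w, "q": q}

def predict(model, x):
    """Predict the concentration for one spectrum."""
    score = sum((xj - mj) * wj for xj, mj, wj in zip(x, model["x_mean"], model["w"]))
    return model["y_mean"] + model["q"] * score

# Synthetic data: one peak whose height scales with concentration, on a fixed baseline.
rng = random.Random(0)
peak = [0.1, 0.8, 1.0, 0.6, 0.2]
baseline = [2.0, 2.1, 2.2, 2.1, 2.0]
conc = [rng.uniform(0.0, 5.0) for _ in range(40)]
spectra = [[b + c * s for b, s in zip(baseline, peak)] for c in conc]

# Hold out ~20% of the samples at random, as on the slide.
indices = list(range(40))
rng.shuffle(indices)
test, train = indices[:8], indices[8:]
model = fit_pls1_single_component([spectra[i] for i in train], [conc[i] for i in train])
errors = [abs(predict(model, spectra[i]) - conc[i]) for i in test]
print(max(errors))
```

    Because the synthetic spectra are noise-free and vary along a single direction, one component recovers the concentration almost exactly; real Raman spectra contain overlapping peaks and noise, which is why component count is usually tuned by cross-validation.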

  • Raman model development

    Example of model performance for main metabolic markers:

    $$R^2 = 1 - \frac{\text{Residual Sum of Squares}}{\text{Total Sum of Squares}}$$

    • ~55 metabolites have R² > 0.85.

    • ~88 metabolites have R² > 0.80.

    Analyte     Raman R²   Raman SD
    Glucose     0.8354     0.0842
    Lactate     0.9166     0.0440
    Glutamine   0.8330     0.0846
    Glutamate   0.8490     0.0602
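
    The R² definition above translates directly into code. The sketch below uses made-up measured and predicted values; the reading of the "Raman SD" column as the standard deviation of R² across repeated model fits is an assumption, and the three repeat scores are hypothetical.

```python
from statistics import mean, stdev

def r_squared(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares / total sum of squares)."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up offline measurements and Raman-model predictions (g/L).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
score = r_squared(y_true, y_pred)
print(round(score, 4))  # 0.98

# Hypothetical R^2 values from repeated fits, summarised as mean and SD.
repeat_scores = [0.90, 0.92, 0.94]
summary = (mean(repeat_scores), stdev(repeat_scores))
```

    A model that always predicts the mean of the measured values scores R² = 0, so values of 0.83 to 0.92 in the table indicate that most of the variance in the offline measurements is captured.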

  • Raman model development

    Performance of Raman models for markers selected using the machine learning algorithm

    Rank  Marker                 Function                 Variable importance (%)  Raman mean R²
    1     Proline                Amino acid               81                       0.94
    2     Glucose                Glycolysis               75                       0.84
    3     Hydroxyphenyl lactate  Lactic acid              71                       0.89
    4     Asparagine             Nitrogen donor           67                       0.90
    5     Histidine              Amino acid               66                       0.82
    6     Spermine               Cholesterol metabolism   64                       0.58
    7     Malate                 TCA cycle by-product     56                       0.92
    8     Glutamine              Amino acid               52                       0.83
    9     Glycine                Formation of serine      47                       0.92
    10    Lactate                Glycolysis by-product    46                       0.92
    11    Serine                 Amino acid               44                       0.88
    12    Pyruvate               Glycolysis               33                       0.87
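
    One way to read this table is as a two-stage filter: rank markers by variable importance, then keep only those the Raman model predicts well. The sketch below applies that idea to the values on the slide. The 0.80 cut-off is an assumption chosen for illustration (it matches one of the thresholds quoted earlier in the deck), not a criterion stated by the author.

```python
# (marker, variable importance %, Raman mean R^2), values taken from the slide.
markers = [
    ("Proline", 81, 0.94),
    ("Glucose", 75, 0.84),
    ("Hydroxyphenyl lactate", 71, 0.89),
    ("Asparagine", 67, 0.90),
    ("Histidine", 66, 0.82),
    ("Spermine", 64, 0.58),
    ("Malate", 56, 0.92),
    ("Glutamine", 52, 0.83),
    ("Glycine", 47, 0.92),
    ("Lactate", 46, 0.92),
    ("Serine", 44, 0.88),
    ("Pyruvate", 33, 0.87),
]

def shortlist(markers, min_r2=0.80):
    """Return marker names ordered by importance, keeping those with R^2 >= min_r2."""
    ranked = sorted(markers, key=lambda m: m[1], reverse=True)
    return [name for name, _importance, r2 in ranked if r2 >= min_r2]

monitorable = shortlist(markers)
print(monitorable)
```

    With this cut-off, spermine is the only marker dropped: it ranks sixth for importance but its Raman model (R² 0.58) is too weak to monitor it reliably in real time.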

  • Real time monitoring of viral titre

  • Data driven decision making

    [Figure: analytics maturity curve, business value (y-axis) against complexity (x-axis), moving from hindsight through insight to foresight]

    • Descriptive analytics: What happened?
    • Diagnostic analytics: Why did it happen?
    • Predictive analytics: What will happen?
    • Prescriptive analytics: How can we make it happen?

  • Going forward

    Device manager

    Data Parser

    Digital twin

    Data Processing

    Manual upload

    Data store

    AnthaHub file watcher
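
    The file watcher, data parser and data store named on this slide can be pictured with a generic polling sketch. This is not the AnthaHub API or its behaviour: every function name, the CSV layout and the in-memory dictionary standing in for the data store are hypothetical.

```python
import csv
import os
import tempfile

def parse_csv(path):
    """Data parser: read one instrument export into a list of row dictionaries."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))

def poll_once(watch_dir, seen, data_store):
    """File watcher: parse any CSV not seen before and return the new file names."""
    new_files = []
    for name in sorted(os.listdir(watch_dir)):
        if name.endswith(".csv") and name not in seen:
            data_store[name] = parse_csv(os.path.join(watch_dir, name))
            seen.add(name)
            new_files.append(name)
    return new_files

# Usage with a temporary directory standing in for an instrument export folder.
with tempfile.TemporaryDirectory() as watch_dir:
    seen, data_store = set(), {}
    with open(os.path.join(watch_dir, "run01.csv"), "w", newline="") as handle:
        handle.write("time_h,glucose_g_per_l\n0,4.5\n24,3.9\n")
    first = poll_once(watch_dir, seen, data_store)   # picks up run01.csv
    second = poll_once(watch_dir, seen, data_store)  # nothing new
    print(first, second)
```

    A production system would call the polling function on a timer or use operating-system file notifications, and would write to a persistent store rather than a dictionary.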

  • Data acquisition and processing using Antha

  • Real time monitoring of metabolites

  • Real-time online monitoring

  • Future looking - In-silico digital twin to support process improvement

    Targeted and reduced experimentation

    Robust, optimised processes

    Improved process understanding

    Soft sensors and augmented reality

    In-silico modelling, process simulation, CFD

    Digital Twin

  • Acknowledgements

    John Churchwell, Oliver Dewhirst, Isobelle Evie, Marco Baradez, Anna Williams

    Peter Jones, Nicholas Clarkson, Laura McCloskey, Lee Davies, Rui Sanches, Thomas Williams, Robert Law, Ann Justino

    Markus Gershater, James Arpino, James Rutley, Allen Shaw
