Native-Source Structural Proteomics

Protein Structure Initiative Bottlenecks Workshop, April 15th, 2008

Nathaniel Echols*, Monica Totir*, Andrew May#, Chloe Zubieta*, Alisa Moskaleva*, Tom Alber*

* UC Berkeley
# Fluidigm Corporation




Native-source structural proteomics

• Native sources provide access to samples that may be difficult to obtain by recombinant methods

• Project goal: obtain structures of complexes and low-abundance proteins

1. Scale up purification (>100 g protein)

2. Scale down crystallization (picoliter reactions)

• No cloning, no overexpression.


Experimental approach

• Use E. coli as a model system to develop the purification protocol necessary to go from grams of starting material to 100 μg fractions

• Screen the final samples at a concentration of >10 mg/ml in Fluidigm Topaz chips and identify the crystallizable fractions

• Identify samples by mass spectrometry

• Set the selected samples in diffraction capable chips or nanodrop crystallization trays for X-ray data collection
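
The quantities in this approach imply a tight sample-volume budget, which a short calculation makes concrete. The sketch below is illustrative only. The 100 μg mass and the 10 mg/ml concentration come from the bullets above. The 10 nL chamber volume is borrowed from the diffraction-capable chip slide, and applying it to a 96-condition screen is an assumption, as is ignoring dead volume.

```python
# Hypothetical sample-volume budget for one purified fraction.
fraction_mass_ug = 100.0        # from the slide: 100 µg fractions
concentration_mg_per_ml = 10.0  # from the slide: >10 mg/ml (lower bound used here)
chamber_volume_nl = 10.0        # assumption: 10 nL of sample per condition
conditions_per_screen = 96      # assumption: one 96-condition screen

# 1 mg/ml equals 1 µg/µL, so mass (µg) / concentration (µg/µL) gives µL.
max_volume_ul = fraction_mass_ug / concentration_mg_per_ml
screen_volume_ul = chamber_volume_nl * conditions_per_screen / 1000.0
screens_possible = int(max_volume_ul // screen_volume_ul)

print(f"At most {max_volume_ul:.1f} µL of sample per fraction")
print(f"One screen consumes about {screen_volume_ul:.2f} µL (dead volume ignored)")
print(f"Roughly {screens_possible} screens per fraction")
```

Because the concentration is a lower bound, the available volume is an upper bound; real chips will also lose some sample to loading.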


Proof-of-concept: the E. coli proteome

• Small, well-studied proteome, but still some novelty:
  • 4243 predicted proteins (manageable number of molecular species)
  • 860 membrane proteins
  • 1000 proteins with > 90% sequence identity to known structures
  • 1250 with > 50% sequence identity
  • 2000 with > 30% sequence identity
  • Nearly 1400 uncharacterized non-membrane proteins

• Existing structures allow us to validate approach

• Easy to grow in massive quantities

• Lysis and clarification are relatively simple
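
The proteome counts above can be turned into a rough estimate of how much of the proteome lacks a close structural homolog. This is a back-of-the-envelope sketch, not an analysis from the talk. It assumes the identity thresholds are cumulative, so that the > 30% count includes the > 50% and > 90% groups.

```python
# Counts quoted on the slide for the E. coli proteome.
predicted = 4243
membrane = 860
identity_over_30 = 2000  # proteins with >30% sequence identity to a known structure

non_membrane = predicted - membrane
# Assumption: the >30% figure includes the >50% and >90% groups.
no_close_homolog = predicted - identity_over_30
share = 100 * no_close_homolog / predicted

print(f"Non-membrane proteins: {non_membrane}")
print(f"No structure above 30% identity: {no_close_homolog} ({share:.0f}% of the proteome)")
```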


Proteome component sizes

Cellular protein content is dominated by large assemblies


Purification scheme

A new philosophy (keep everything) required new strategies

Scalable, gentle purification scheme

[Flow diagram. Steps: lyse at pH 7-8, then cross-flow size fractionation (500 kDa TFF).]

Proteins/complexes bigger than 500 kDa: sucrose gradients; size exclusion chromatography; MonoQ

Proteins/complexes smaller than 500 kDa, columns shown: Capto Q; SP Sepharose; Superdex 200; Phenyl; MonoQ/MonoS


Purification scheme (continued)

[Flow diagram. Proteins/complexes smaller than 500 kDa, labels shown: Blue Heparin Capto MMC; Superdex 200; Phenyl; MonoQ/MonoS]

Column size    Approx. protein quantity
1-2 L          50 g
300 mL         10 g
20-50 mL       1 g
1-8 mL         10-100 mg

[Figure: typical anion exchange chromatogram of the final samples]


The first large-scale prep

• 200 g of E. coli cells grown in M9 minimal medium and lysed

• Purification scheme: Capto Q, Phenyl, MonoQ/MonoS

• 272 fractions analyzed in 96-well Caliper electrophoresis robot and selected for crystallization

[Figure: Caliper “gel”]


Crystallization pipeline

[Flow diagram with the following boxes]

• Purity checked by Caliper gel
• Microfluidic crystallization with the Fluidigm TOPAZ system (8.96 chips)
• Promising chip crystals
• Sub-optimal chip crystals
• MS identification (shown twice)
• Diffraction-capable chips
• 96-well sitting drop for further optimization
• X-ray data collection


http://www.fluidigm.com/topaz.htm

Microfluidic crystallization

• 272 samples set in Fluidigm TOPAZ 8.96 chips with Index screen

• Automated inspection and scoring required to find crystals efficiently

• 190/272 (70%) produced crystals or microcrystals in chips (high redundancy in crystal forms)

• 50 unique crystal forms by visual inspection

• High-quality crystals possible even in very impure samples

[Plot: Purity / % (axis 0 to 120) versus Resolution / Å (axis 1.6 to 2.6)]


Crystal optimization

• 66 samples picked for optimization in nanodrop vapor diffusion trays (using Mosquito robot)

• Protocol: sample 40%-100% precipitant concentration with different protein:well ratios (1:3, 1:1, 3:1)

• 50 of the 66 hits (76%) were reproducible by this method
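
The optimization protocol amounts to a small grid of drops per hit. A minimal sketch of how such a grid could be enumerated follows. The three protein:well ratios are from the slide. The five evenly spaced precipitant levels are an assumption, since the slide gives only the 40%-100% range, and the function name is hypothetical.

```python
from itertools import product

# Assumption: five evenly spaced levels; the slide gives only the 40%-100% range.
precipitant_pct = [40, 55, 70, 85, 100]
# Protein:well volume ratios quoted on the slide.
ratios = [(1, 3), (1, 1), (3, 1)]

def optimization_grid(levels, ratios):
    """Return one dict per drop: precipitant level and protein:well ratio."""
    return [
        {"precipitant_pct": pct, "protein": p, "well": w}
        for pct, (p, w) in product(levels, ratios)
    ]

grid = optimization_grid(precipitant_pct, ratios)
print(len(grid), "drops per hit")
for drop in grid[:3]:
    print(drop)
```

With these assumed levels, each hit needs 15 drops, so the 66 picked samples would need about 990 drops in total.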


“Hands-Free” data collection

• ALS Beamline 8.3.1

• Diffraction-capable microfluidic chips

[Figure labels: Reagents; Samples; 10 nL sample chambers]


Structure determination

• MS identification of unique crystals should be the first step

• 25 unique native datasets collected at ALS 8.3.1/12.3.1
  • 15 already published structures identified
  • 3 structures novel in E. coli, phased by MR

• Robotics and automation software used for data collection and processing whenever possible


Rapid structure identification by MR

• Concept: identify protein from “anonymous” diffraction data (no mass spec info)

• Search set of every PDB structure homologous to an E. coli protein (~10,000 models)

• Molecular replacement rotation function run using each model
  • Identical structures are usually high-scoring
  • Homologous proteins may still score better than average

• Potential solutions can be verified by full MR
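
One way to picture the screening step is as outlier detection over the rotation-function scores of all search models. The sketch below is an illustration, not the method used in the talk. It does not call any MR program. The function name, the z-score criterion, the cutoff of 3.0 and the example scores are all assumptions.

```python
from statistics import mean, pstdev

def rank_models(rf_scores, z_cutoff=3.0):
    """Rank search models by rotation-function score and flag outliers.

    rf_scores maps a model identifier to its best rotation-function score.
    Returns (model, z-score) pairs with z >= z_cutoff, best first.
    """
    values = list(rf_scores.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all models score the same, so nothing stands out
    z = {model: (score - mu) / sigma for model, score in rf_scores.items()}
    hits = [(m, s) for m, s in z.items() if s >= z_cutoff]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Made-up scores for illustration only: 20 background models and one outlier.
scores = {f"model_{i:02d}": 4.0 + 0.1 * (i % 5) for i in range(20)}
scores["model_hit"] = 9.5
for model, zscore in rank_models(scores):
    print(f"{model}: z = {zscore:.1f} -> candidate for full MR")
```

Only models that pass the cutoff would go on to the more expensive full MR run, in line with the final bullet above.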


Experimental phasing

• The largest bottleneck: much more manual labor required

• Cryoprotectants contain heavy monovalent ions (Br-, Rb+)

• Metal quick-soaks (0.5 - 5 mM):
  • Ethyl mercury phosphate/thimerosal
  • HgCl2 or PCMBS (p-chloromercuribenzenesulfonic acid)
  • SmCl3
  • PtCl4, PtCl6


Current structures, new and old

New (% identity to PDB):
• Methylglyoxal reductase (37%)
• pGlucose isomerase (65%)
• β-glucosidase (?) (bglA) (33%)

Old:
• Cystathionine -synthase
• Catalase HPII (also in truncated form)
• 5-keto-4-deoxyuronate isomerase
• Hsp31 chaperone
• Pyruvate kinase
• PPIase
• Molybdopterin biosynthesis prot. B
• pSer aminotransferase
• Argininosuccinate lyase
• Lysyl-tRNA synthetase
• Dihydrodipicolinate synthase
• Citrate synthase
• ycaC
• Transhydrogenase domain I

(Structures labelled in red were identified by brute-force search.)


Purity of crystallized samples


Summary

• Macro-to-micro strategy tested with E. coli

• Large-scale fractionation pipeline:
  • New approaches and equipment (TFF, larger columns, Caliper CE robot) needed to scale up and keep everything
  • Currently 464 fractions isolated for crystallization

• Small-scale crystallization:
  • >50% of fractions crystallized in Topaz microfluidic format
  • Many impure fractions yielded starting crystals
  • Optimization in sitting drops and new diffraction chips was efficient

• Structure determination:
  • 25 data sets collected, 18 structures phased, all oligomeric
  • 3 structures novel to E. coli
  • Brute-force molecular replacement was used in most cases


Future directions

• Continue improvements to purification methods

• Pathogenic organisms (e.g. Mycobacteria)

• Plant/mammalian proteomes: diploid, much larger and more complex

• Smaller sets of related proteins:
  • Protease-resistant domains
  • Serum proteins
  • ATP-binding proteins
  • Metalloproteins
  • Large complexes


Acknowledgements

• Tom Alber, Monica Totir, Chloe Zubieta, Alisa Moskaleva
• Andy May (Fluidigm)
• Scott Gradia, James Berger (UCB)
• James Holton (ALS)
• George Meigs, Jane Tanamatchi (ALS)
• ALS beamlines 8.3.1, 12.3.1
• Tony Iavarone (QB3 MS facility)
• Scripps Center for Mass Spectrometry
• W.M. Keck Foundation
• Millipore Corporation
• Funded in part by UC Discovery/Fluidigm Corporation and NIGMS grant GM71326-02


Second large-scale prep – a better purification scheme

• 1000 g of E. coli cells grown in M9 minimal medium and lysed

• 192 unique final samples to be screened in 8.96 chips and subsequently set up in diffraction-capable chips

[Flow diagram. Steps: lysate at pH 7, then cross-flow size fractionation (500 kDa TFF).]

Proteins/complexes bigger than 500 kDa: sucrose gradients; size exclusion chromatography; MonoQ

Proteins/complexes smaller than 500 kDa, labels shown: SP Sepharose Blue Heparin Capto MMC; Superdex 200; Phenyl; MonoQ/MonoS


Apparently rare proteins accessible

[Plot: # genes versus abundance (# transcripts)]

I will have to look this up. Or do we have something like this?
