Structure and Motion Jean-Claude Latombe Computer Science Department Stanford University NSF-ITR Meeting on November 14, 2002

Page 1:

Structure and Motion

Jean-Claude Latombe

Computer Science Department Stanford University

NSF-ITR Meeting on

November 14, 2002

Page 2:

Stanford’s Participants

PIs: L. Guibas, J.C. Latombe, M. Levitt

Research Associate: P. Koehl

Postdocs: F. Schwarzer, A. Zomorodian

Graduate students: S. Apaydin (EE), S. Ieong (CS), R. Kolodny (CS), I. Lotan (CS), A. Nguyen (Sc. Comp.), D. Russel (CS), R. Singh (CS), C. Varma (CS)

Undergraduate students: J. Greenberg (CS), E. Berger (CS)

Collaborating faculty: A. Brunger (Molecular & Cellular Physiology), D. Brutlag (Biochemistry), D. Donoho (Statistics), J. Milgram (Math), V. Pande (Chemistry)

Page 3:

Problems Addressed

Biological functions derive from the structures (shapes) achieved by molecules through motions

Determination, classification, and prediction of 3D protein structures

Modeling of molecular energy and simulation of folding and binding motion

Page 4:

What’s New for Computer Science?

Massive amount of experimental data

Importance of similarities

Multiple representations of structure

Continuous energy functions

Many objects forming deformable chains

Many degrees of freedom

Ensemble properties of pathways

Page 5:

Massive amount of experimental data

Abstract/simplify data sets into compact data structures

E.g.: Electron density map → Medial axis

Page 6:

Importance of similarities

Segmentation/matching/scoring techniques

[Figure: data set → clustered data → small library]

E.g.: Libraries of protein fragments [Kolodny, Koehl, Guibas, Levitt, JMB (2002)]

Page 7:

1tim Approximations

Complexity 10 (100 fragments of length 5): 0.9146 Å cRMS

Complexity 2.26 (50 fragments of length 7): 2.7805 Å cRMS

real protein

Page 8:

Alignment of Structural Motifs [Singh and Saha; Kolodny and Linial]

Problem: Determine if two structures share common motifs:

• 2 (labelled) structures in $\mathbb{R}^3$: $A = \{a_1, a_2, \ldots, a_n\}$, $B = \{b_1, b_2, \ldots, b_m\}$

• Find subsequences $s_a$ and $s_b$ such that the substructures $\{a_{s_a(1)}, a_{s_a(2)}, \ldots, a_{s_a(l)}\}$ and $\{b_{s_b(1)}, b_{s_b(2)}, \ldots, b_{s_b(l)}\}$ are similar

Twofold problem: alignment and correspondence

Score; Approximation; Complexity

Page 9:

Iterative Closest Point (Besl-McKay) for alignment

[R. Singh and M. Saha. Identifying Structural Motifs in Proteins. Pacific Symp. on Biocomputing, Jan. 2003.]

Score: RMSD distance
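Iterative Closest Point alternates two steps: pair each point of one structure with its nearest point in the other (correspondence), then find the rigid motion that best superimposes the paired points (alignment). The following is a minimal sketch of that loop, not the implementation of Singh and Saha. It assumes NumPy is available, uses a brute-force nearest-neighbour search, and solves the alignment step with the SVD-based (Kabsch) least-squares fit; the function names `kabsch`, `nearest`, `rmsd` and `icp` are made up for illustration.

```python
import numpy as np

def kabsch(P, Q):
    """Rotation R and translation t minimising sum ||R p_i + t - q_i||^2
    for paired points (rows of P and Q)."""
    pc, qc = P.mean(axis=0), Q.mean(axis=0)
    H = (P - pc).T @ (Q - qc)                 # 3x3 covariance of the pairs
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, qc - R @ pc

def nearest(P, Q):
    """Index of the closest point of Q for every point of P (brute force)."""
    d2 = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def rmsd(P, Q):
    """Root-mean-square deviation between paired points."""
    return float(np.sqrt(((P - Q) ** 2).sum(axis=1).mean()))

def icp(A, B, iters=20):
    """Rigidly move A onto B; returns moved points, matches, and RMSD score."""
    cur = A.copy()
    for _ in range(iters):
        match = nearest(cur, B)               # correspondence step
        R, t = kabsch(cur, B[match])          # alignment step
        cur = cur @ R.T + t
    match = nearest(cur, B)
    return cur, match, rmsd(cur, B[match])
```

ICP only converges to a local optimum, so in practice the result depends on the starting pose; the fixed iteration count here is a simplification.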

Page 10:

[R. Singh and M. Saha. Identifying Structural Motifs in Proteins. Pacific Symp. on Biocomputing, Jan. 2003.]

Trypsin

Trypsin active site

Page 11:

[R. Singh and M. Saha. Identifying Structural Motifs in Proteins. Pacific Symp. on Biocomputing, Jan. 2003.]

Trypsin active site against 42 trypsin-like proteins

Page 12:

Multiple representations of structure

ProShape software [Koehl, Levitt (Stanford), Edelsbrunner (Duke)]

Page 13:

Decoys generated using “physical” potentials

Select best decoys using distance information

Statistical potentials for proteins based on alpha complex [Guibas, Koehl, Zomorodian]

Page 14:

Continuous energy functions

Many objects in deformable chains

Many pairs of objects, but relatively few are close enough to interact

Data structures that capture proximity, but undergo small or rare changes

During motion simulation:

- detect steric clashes (self-collisions)

- find pairs of atoms closer than cutoff

Page 15:

Other application domains:

Modular reconfigurable robots

Reconstructive surgery

Page 16:

Fixed bounding-volume hierarchies don’t work

Page 17:

Instead, exploit what doesn’t change: chain topology

Adaptive BV hierarchies [Guibas, Nguyen, Russel, Zhang] [Lotan, Schwarzer, Halperin, Latombe] (SoCG’02)

Page 18:

Wrapped bounding sphere hierarchies [Guibas, Nguyen, Russel, Zhang] (SoCG 2002)

• WBSH undergoes small number of changes

• Self-collision: $O(n \log n)$ in $\mathbb{R}^2$; $O(n^{2-2/d})$ in $\mathbb{R}^d$, $d \ge 3$

Page 19:

ChainTrees [Lotan, Schwarzer, Halperin, Latombe] (SoCG’02)

Assumption: Few degrees of freedom change at each motion step (e.g., Monte Carlo simulation)

Find all pairs of atoms closer than a given cutoff

Find which energy terms can be reused

Page 20:

ChainTrees [Lotan, Schwarzer, Halperin, Latombe] (SoCG’02)

Updating: $O(m \log(N/m))$

Finding interacting pairs: $\Theta(N^{4/3})$ (in practice, sublinear)

Page 21:

ChainTrees: Application to MC simulation (comparison to grid method)

[Figure: two panels, m = 1 and m = 5, each with entries labelled (68), (144), (374), (755)]
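The slides do not describe the grid method used as the baseline. A common version, sketched here as an assumption, hashes every atom into a cubic cell whose side equals the cutoff, so that any interacting pair must lie in the same cell or in adjacent cells. The function name `close_pairs_grid` is hypothetical.

```python
import math
from collections import defaultdict
from itertools import product

def close_pairs_grid(points, cutoff):
    """Return all index pairs (a, b), a < b, whose distance is below cutoff.

    points: list of (x, y, z) tuples. Cell side = cutoff, so only the
    27 cells around a point's own cell need to be inspected.
    """
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        key = tuple(math.floor(c / cutoff) for c in p)
        cells[key].append(idx)

    pairs = set()
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            # .get() does not create empty cells in the defaultdict
            for b in cells.get((cx + dx, cy + dy, cz + dz), ()):
                for a in members:
                    if a < b and math.dist(points[a], points[b]) < cutoff:
                        pairs.add((a, b))
    return pairs
```

This version rebuilds the grid from scratch at every step, whereas the ChainTree is meant to be repaired locally when only a few degrees of freedom change.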

Page 22:

Future work: ChainTrees

Open problem: How to find good moves to make when the conformation is compact and random moves are rejected with high probability?

Run new series of experiments with more complex energy field: EEF1 [Lazaridis & Karplus] (with Pande)

Use library of fragments (with Koehl)

Page 23:

Future Work: Spanner for deformable chain [Agarwal, Gao (Duke); Nguyen, Zhang (Stanford)]

Capture proximity information with a sparse spanner

[Figure: 3HVT]

Page 24:

Many degrees of freedom

Tools to explore large dimensional conformation space:

- Sampling strategies

- Nearest neighbors

Page 25:

Sampling structures by combining fragments [Kolodny, Levitt]

Library of protein fragments (a, b, c, d) → Discrete set of candidate structures (e.g., cabbbc)

Page 26:

Nearest neighbors in high-dimensional space [Lotan and Schwarzer]

Find k nearest neighbors of a given protein conformation in a set of n conformations (cRMS, dRMS)

[Figure: backbone with points a0, a1, a2, a3, a4, a5, a6, ..., am]

Idea: Cut backbone into m equal subsequences
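A rough sketch of the idea follows. It assumes that each of the m subsequences is replaced by the centroid of its atoms (the "Ave. rep." of the timing slide), and that dRMS is the root-mean-square difference between corresponding intra-molecular distances. The function names are illustrative, and the search is plain brute force.

```python
import math

def average_rep(coords, m):
    """Cut the chain into m contiguous pieces; return one centroid per piece.
    Assumes len(coords) >= m so that no piece is empty."""
    n = len(coords)
    reduced = []
    for k in range(m):
        piece = coords[k * n // m:(k + 1) * n // m]
        reduced.append(tuple(sum(axis) / len(piece) for axis in zip(*piece)))
    return reduced

def drms(a, b):
    """RMS difference between the pairwise-distance sets of a and b."""
    n = len(a)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += (math.dist(a[i], a[j]) - math.dist(b[i], b[j])) ** 2
    return math.sqrt(total / (n * (n - 1) / 2))

def k_nearest(query, conformations, k, m):
    """Indices of the k conformations closest to query under reduced dRMS."""
    q = average_rep(query, m)
    order = sorted(range(len(conformations)),
                   key=lambda idx: drms(q, average_rep(conformations[idx], m)))
    return order[:k]
```

Because dRMS uses only internal distances, it needs no superposition step, and the reduced representation cuts the number of distance pairs from order n² to order m² per comparison.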

Page 27:

Nearest neighbors in high-dimensional space [Lotan and Schwarzer]

100,000 decoys of 1CTF (Park-Levitt set)

Computation of 100 NN of each conformation:

- Full rep., dRMS (brute force): ~84 h

- Ave. rep., dRMS (brute force): ~4.8 h

- SVD red. rep., dRMS (brute force): 41 min

- SVD red. rep., dRMS (kd-tree): 19 min

~80% of computed NNs are true NNs

kd-tree software from ANN library (U. Maryland)

Page 28:

Ensemble properties of pathways

Stochastic nature of molecular motion requires characterizing average properties of many pathways

Page 29:

Example #1: Probability of Folding pfold

[Figure: HIV integrase [Du et al. ’98]; from a given conformation, the folded set is reached first with probability pfold and the unfolded set with probability 1 - pfold]

“We stress that we do not suggest using pfold as a transition coordinate for practical purposes as it is very computationally intensive.” Du, Pande, Grosberg, Tanaka, and Shakhnovich, “On the Transition Coordinate for Protein Folding,” Journal of Chemical Physics (1998).

Page 30:

Example #2: Ligand-Protein Interaction

[Sept, Elcock and McCammon ’99]

10K to 30K independent simulations

Page 31:

Probabilistic Roadmap [Apaydin, Brutlag, Hsu, Guestrin, Latombe] (RECOMB’02, ECCB’02)

[Figure: roadmap nodes vi and vj joined by an edge with transition probability Pij]

Idea: Capture the stochastic nature of molecular motion by a network of randomly selected conformations and by assigning probabilities to edges
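The slide does not say how the edge probabilities are chosen. One plausible choice, shown here purely as an assumption, is a Metropolis-style rule: from node i, each of its neighbors is proposed with equal probability and accepted according to the energy difference, and whatever probability is left over becomes the self-transition Pii. The function name `edge_probabilities` and the parameter `kT` are hypothetical.

```python
import math

def edge_probabilities(energy, neighbors, kT=1.0):
    """Return P[i][j] for every roadmap edge; each row sums to 1.

    energy:    dict node -> energy of that conformation
    neighbors: dict node -> list of adjacent nodes (not including itself)
    Assumed rule (not taken from the slides): propose each neighbor with
    probability 1/N_i, accept with probability min(1, exp(-dE / kT)).
    """
    P = {}
    for i, nbrs in neighbors.items():
        row = {}
        for j in nbrs:
            dE = energy[j] - energy[i]
            accept = 1.0 if dE <= 0 else math.exp(-dE / kT)
            row[j] = accept / len(nbrs)
        row[i] = 1.0 - sum(row.values())   # rejected moves stay at node i
        P[i] = row
    return P
```

With this rule, downhill moves are always accepted, and uphill moves become exponentially less likely as the energy gap grows.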

Page 32:

Probabilistic Roadmap [Apaydin, Brutlag, Hsu, Guestrin, Latombe] (RECOMB’02, ECCB’02)

[Figure: node i with self-transition Pii and edges Pij, Pik, Pil, Pim to nodes j, k, l, m; F: Folded set, U: Unfolded set]

Let $f_i = p_{fold}(i)$. After one step:

$f_i = P_{ii} f_i + P_{ij} f_j + P_{ik} f_k + P_{il} f_l + P_{im} f_m$

(two of the $f$ terms on the right-hand side are marked "= 1" on the slide, i.e. they belong to nodes in the folded set F)

One linear equation per node

Solution gives pfold for all nodes

No explicit simulation run

All pathways are taken into account

Sparse linear system
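The one-equation-per-node system can be solved without running any simulation. Below is a minimal sketch that fixes f = 1 on the folded set and f = 0 on the unfolded set, then applies Gauss-Seidel sweeps to the remaining nodes. The choice of iterative solver is an assumption (any sparse linear solver would do), and the dictionary layout for P is only illustrative.

```python
def pfold(P, folded, unfolded, tol=1e-12, max_iter=100_000):
    """Solve f_i = sum_j P[i][j] * f_j with f = 1 on folded, f = 0 on unfolded.

    P: dict node -> {neighbor: transition probability}, rows summing to 1.
    Assumes P[i][i] < 1 for every node outside the two sets.
    """
    f = {i: 0.0 for i in P}
    f.update({i: 0.0 for i in unfolded})
    f.update({i: 1.0 for i in folded})
    free = [i for i in P if i not in folded and i not in unfolded]

    for _ in range(max_iter):
        delta = 0.0
        for i in free:
            self_p = P[i].get(i, 0.0)
            rest = sum(p * f[j] for j, p in P[i].items() if j != i)
            new = rest / (1.0 - self_p)     # move the P_ii f_i term to the left
            delta = max(delta, abs(new - f[i]))
            f[i] = new
        if delta < tol:
            break
    return f

# Toy roadmap: a 5-node chain, node 0 unfolded, node 4 folded,
# each interior node steps left or right with probability 1/2.
chain = {0: {0: 1.0},
         1: {0: 0.5, 2: 0.5},
         2: {1: 0.5, 3: 0.5},
         3: {2: 0.5, 4: 0.5},
         4: {4: 1.0}}
result = pfold(chain, folded={4}, unfolded={0})
```

For this toy chain the exact answer is f_i = i/4, which gives a quick sanity check on the solver.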

Page 33:

Probabilistic Roadmap: Correlation with MC Approach

• 1ROP (repressor of primer)

• 2 helices

• 6 DOF

Page 34:

Probabilistic Roadmap: Computation Times (1ROP)

Monte Carlo: 49 conformations; over 11 days of computer time; over $10^6$ energy computations

Roadmap: 5000 conformations; 1 - 1.5 hours of computer time; ~15,000 energy computations

~4 orders of magnitude speedup!
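One way to read the speedup figure, assuming it refers to computer time per conformation for which pfold is obtained, is the following back-of-the-envelope calculation (taking "over 11 days" as exactly 11 days, so the ratio is a lower bound):

```latex
\begin{align*}
t_{\mathrm{MC}} &\approx \frac{11 \times 24 \times 3600\ \mathrm{s}}{49}
                 \approx 1.9 \times 10^{4}\ \mathrm{s\ per\ conformation} \\
t_{\mathrm{roadmap}} &\approx \frac{3600\ \mathrm{s}\ \text{to}\ 5400\ \mathrm{s}}{5000}
                 \approx 0.72\ \text{to}\ 1.08\ \mathrm{s\ per\ conformation} \\
\frac{t_{\mathrm{MC}}}{t_{\mathrm{roadmap}}} &\approx 1.8 \times 10^{4}\ \text{to}\ 2.7 \times 10^{4}
                 \sim 10^{4}
\end{align*}
```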

Page 35:

Future work: Probabilistic Roadmap

Non-uniform sampling strategies

Encoding molecular dynamics into probabilistic roadmaps (with V. Pande)

Quantitative experiments with ligand-protein binding (with V. Pande)

Page 36:

Bio-X – Clark Center

Page 37:

The following slides relate to non-research issues. I do not plan to present them. Jack and Leo may want to use the contents of some of them for their own presentations.

Page 38:

Education

• Tutorial on Delaunay, Alpha-Shape and Pockets (Koehl)

• A biocomputing Notebook (Koehl)

• Biocomputation lectures in pre-existing classes:

– CS326 – motion planning: molecular motion, probabilistic roadmaps, self-collision detection (Latombe)

– CS468 – intro to computational topology: finding pockets and tunnels in molecules, computing surface areas and volumes and their derivatives (Zomorodian)

• New class on Algorithmic Biology (Batzoglou, Guibas, Latombe)

• Graduate Curriculum Committee, Bio-Engineering Dept., Stanford (Latombe)

Page 39:

Trained Students (1/2)

PhD students:

- Serkan Apaydin, EE

- An Nguyen, Scientific Computing

- Carlos Guestrin, CS (Daphne Koller’s group)

- Itay Lotan, CS

- Rachel Kolodny, CS

- Daniel Russel, CS

- Samuel Ieong, CS

Most graduate students have a principal advisor in CS and a secondary one in a bio-related department (Levitt, Brutlag, Pande)

Page 40:

Trained Students (2/2)

Graduated Master’s students:

- Rohit Singh, finding motifs in proteins, best Stanford CS master’s thesis, June ’02 [current position: bioinformatics company in San Diego]

- Chris Varma, study of ligand-protein interaction with probabilistic roadmaps, June ’02 [current position: PhD student, Harvard/MIT Biomedical program]

Current Master’s student:

- Ben Wong, modeling T cell activity

Undergraduates:

- Eric Berger, CS, Stanford, summer internship

- Julie Greenberg, CS, Harvard, summer internship

Page 41:

Visitors

• Prof. Alberto Munoz, Math Dept., University of Yucatan, Mexico; 3 months, Summer ’02; haptic interaction and probabilistic roadmaps

• Prof. Ileana Streinu, Smith College; 6 months, from Sept. ’02; protein folding

Page 42:

Interactions Within Stanford

- Guibas and Levitt, with J. Milgram (Math): topology of configuration spaces of chains

- Guibas, with V. Pande (Chemistry) and D. Donoho (Statistics): non-linear multi-resolution analysis of molecular motions

- Latombe and Apaydin, with D. Brutlag (Biochemistry) and V. Pande: probabilistic roadmaps

- Latombe and Lotan, with V. Pande: efficient MC simulation

Page 43:

Interactions Outside Stanford

- Collision Detection for Deforming Necklaces, P. Agarwal, L. Guibas, A. Nguyen, D. Russel, and L. Zhang. Invited to special issue of Comp. Geom., Theory and Applications, following presentation at SoCG’02.

- Kinetic Medians and kd-Trees, P. Agarwal, J. Gao, and L. Guibas. Proc. 10th European Symp. Algorithms, LNCS 2461, Springer-Verlag, pp. 5-16, 2002.

- Stochastic Roadmap Simulation: An Efficient Representation and Algorithm for Analyzing Molecular Motion, M.S. Apaydin, D.L. Brutlag, C. Guestrin, D. Hsu, and J.C. Latombe. Proc. RECOMB’02, Washington D.C., pp. 12-21, 2002.

- Efficient Maintenance and Self-Collision Testing for Kinematic Chains, I. Lotan, F. Schwarzer, D. Halperin, and J.C. Latombe. SoCG’02, pp. 43-52, June 2002.

- Stochastic Conformational Roadmaps for Computing Ensemble Properties of Molecular Motion, M.S. Apaydin, D.L. Brutlag, C. Guestrin, D. Hsu, and J.C. Latombe. Workshop on Algorithmic Foundations of Robotics (WAFR), Nice, Dec. 2002.

Page 44:

Attendance at Conferences

- BCATS ’01 and ’02 [Bio-Computation At Stanford]

- RECOMB ’02 [Int. Conf. on Research in Computational Biology]

- ISMB ’02 [Int. Conf. on Intelligent Syst. for Molecular Biology]

- ECCB 2002 [European Conf. on Computational Biology]

- Biophysical Society Symp. on Molecular Simulations in Structural Biology, 2002

- SoCG 2002 [ACM Symp. on Computational Geometry]

Page 45:

Outreach

- Latombe and Levitt serve as members of the Scientific Leadership Council of Stanford’s Bio-X program

- Presentations: Stanford’s Bio-X Symposium (3/02), Stanford’s Computer Forum (3/02), Berkeley’s Broad Area Seminar (4/02)

- Conference committees:

Guibas, program committee, WAFR’02 and SoCG’03

Latombe, program committee, 1st IEEE Bioinformatics Conf. ’03

Apaydin, organization committee of BCATS’02

Page 46:

The following slides are extra slides that I removed from my presentation for lack of time

Page 47:

General Goals

Larger proteins considered → computational efficiency

Diversity of molecules and interactions → computational abstractions

Extension of in-silico experiments → computational correctness

Enable biological studies that were not possible before, more systematically

Page 48:

Approach

Select hard problems

Close interaction between computer scientists (Guibas, Koehl, Latombe) and biologists (Koehl, Levitt, Brutlag, Pande, Brunger)

Most graduate students are CS students with secondary advisor in biology

Perform extensive tests

Page 49:

Electron density map → Medial axis [Guibas, Brunger, Russel]

Medial axis of iso-surfaces to estimate backbone

Cleaning and simplification of axis to filter noise out

Persistence of features across multiple iso-surfaces

Page 50:

Continuous energy function

Essential for protein structure prediction and molecular motion simulation:

- Statistical potentials based on alpha complex

- Maintenance of energy values during simulation

Page 51:

Instead, exploit what doesn’t change: chain topology

Adaptive BV hierarchies [Guibas, Nguyen, Russel, Zhang] [Lotan, Schwarzer, Halperin, Latombe] (SoCG’02)

Balanced binary trees of constant topology

Efficient repair of position/size of BVs

Page 52:

Future Work: Spanner for deformable chain [Agarwal, Gao (Duke); Nguyen, Zhang (Stanford)]

Page 53:

Probabilistic Roadmap

• 1ROP (repressor of primer): 2 helices, 6 DOF

• 1HDD (Engrailed homeodomain): 3 helices, 12 DOF

H-P energy model with steric clash exclusion [Sun et al., 95]