Presented by Jennelle Heyer and Jonathan Ebbers December 7, 2004

Protein Modularity and Evolution:

An examination of organism complexity via protein domain structure

Presented by Jennelle Heyer and Jonathan Ebbers

December 7, 2004

Presentation Outline

• Background Material - Protein Evolution, Theory of Domains, Gene Number

• Hypothesis - Using a model protein family

• Procedure/Methods - DPIP Program, Phylogenetic Analysis

• Results

• Discussion/Conclusions

Theories of Protein Evolution

A long time ago, in the primordial soup of life, small polypeptides began to form…

[Diagram, two parallel paths: HDLC or TCP or … → HDLC + TCP = HCLCTCP → HCI*CTCP + TCP… (first path) or HCI*CTCP + QZX… (second path) → Functional proteins]

Concept of Modularity

• Proteins consist of one or more domains that were pieced together over time

• Domains are the building blocks of proteins

– Defined as “spatially distinct structures that could conceivably fold and function in isolation” (Ponting and Russell, 2002)

– Dictate the function of the protein

– Evolutionary pressure to conserve (sequence and/or structure)

Organismal Complexity

• The nematode, C. elegans, has 19,500 genes in its genome

• Humans have between 20,000 and 25,000 genes in their genome

• HOW CAN THAT BE?

• Alternative splicing, multi-functional/network proteins

Hypothesis

• Gene products (proteins) can become multi-functional through the introduction of domains

• “…evolution does not produce innovation from scratch. It works on what already exists, either transforming a system to give it a new function or combining several systems to produce a more complex one” (Jacob, 1946)

• More complex or phylogenetically derived organisms produce proteins with greater domain complexity

Hypothesis Part II

• Create a protein domain “tool”

– Position

– Partner domain

– General organization

– Protein evolution

– Using a variety of sequenced genomes

• Allow investigators to learn about a domain of interest and apply that knowledge to their research

Kinesins: A model protein family

• Motor proteins found in eukaryotic organisms

• Contain a conserved motor domain

• Bind and walk along microtubules

• Can carry a variety of “cargo”

• May contain multiple domains

http://www.mb.tn.tudelft.nl/projects/

Kinesins: A model protein family

• Arabidopsis thaliana, a model plant species, contains 61 kinesins

• S. pombe – 10, C. elegans – 22, Drosophila – 25, human and mouse ~ 45

From Reddy and Day, 2001

Programming Approach

• Two programs used, BLAST and InterProScan, tied together with Perl scripts

• Give a domain sequence to PSI-BLAST, which will identify proteins that have that domain.

• One by one, give those protein sequences to InterProScan (IPR), which identifies the domains in each protein.

• Create a listing of proteins and map the data into a phylogeny.

• Create a tree based on the phylogeny and domains
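
The steps above can be sketched as a small pipeline. This is a minimal illustration in Python, not the DPIP Perl code: the function names, the record layout, and the toy search and annotation functions are all assumptions made for this sketch, and the stand-ins replace real BLAST and InterProScan calls so the example runs on its own.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical record layout for this sketch: (protein_id, organism, sequence).
Hit = Tuple[str, str, str]


def collect_domain_architectures(
    domain_sequence: str,
    find_proteins: Callable[[str], List[Hit]],
    find_domains: Callable[[str], List[str]],
) -> Dict[str, Dict[str, List[str]]]:
    """Return {organism: {protein_id: [domain names in sequence order]}}."""
    by_organism: Dict[str, Dict[str, List[str]]] = defaultdict(dict)
    # Step 1: similarity search with the query domain.
    for protein_id, organism, sequence in find_proteins(domain_sequence):
        # Step 2: annotate each hit, one protein at a time.
        by_organism[organism][protein_id] = find_domains(sequence)
    # Step 3: the grouped listing is what a tree-building step would consume.
    return dict(by_organism)


# Toy stand-ins so the sketch runs without BLAST or InterProScan installed.
def toy_search(domain_sequence: str) -> List[Hit]:
    return [
        ("protA", "A. thaliana", "MOTORxxxCC"),
        ("protB", "C. elegans", "MOTOR"),
    ]


def toy_annotate(sequence: str) -> List[str]:
    known = {"MOTOR": "kinesin_motor", "CC": "coiled_coil"}
    found = [(sequence.find(tag), name) for tag, name in known.items() if tag in sequence]
    return [name for _, name in sorted(found)]


listing = collect_domain_architectures("MOTOR", toy_search, toy_annotate)
for organism, proteins in sorted(listing.items()):
    print(organism, proteins)
```

Passing the search and annotation steps in as functions keeps the control flow separate from the external tools, so either step could be swapped for a wrapper around the real program.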

Program Flowchart

Domain sequence → [BLAST] → List of proteins with similar domains → [InterProScan] → List of domains in every protein → [Maketree] → Tree (includes domains)

Program Details

• Database selection:

– BLAST: RefSeq over nr

– InterProScan: SMART database only

• Threshold values:

– BLAST: Option to change, improve resolution

– InterProScan: E-value at 0.99, up from 0.01

• Used Arabidopsis sequences as a control

• Name: DPIP (Domain Placement in Proteins)
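
To illustrate what moving an E-value cutoff from 0.01 to 0.99 does, here is a small Python sketch. The hit records and their E-values are invented for illustration and are not DPIP output; the point is only that a looser cutoff keeps weaker matches, which raises sensitivity at the cost of more possible false positives.

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    protein_id: str
    domain: str
    e_value: float


def filter_hits(hits: List[DomainHit], max_e_value: float) -> List[DomainHit]:
    """Keep hits whose E-value is at or below the cutoff."""
    return [hit for hit in hits if hit.e_value <= max_e_value]


# Invented hits, for illustration only.
hits = [
    DomainHit("protA", "kinesin_motor", 1e-40),
    DomainHit("protA", "FHA", 0.2),
    DomainHit("protB", "PH", 0.9),
]

strict = filter_hits(hits, 0.01)   # only the strong motor-domain match survives
relaxed = filter_hits(hits, 0.99)  # the weaker FHA and PH matches are kept too
print(len(strict), len(relaxed))
```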

Results

• A Quick Look at the Data

• Phylogenetic Approach – Hypothesis I

• Qualitative Approach – Hypothesis II

A Quick Look

Phylogenetic Approach

• “More complex or phylogenetically derived organisms produce proteins with greater domain complexity”

• Trace domain characteristics on a preset tree

– Use MacClade tree drawing software

– Uses input data to create most parsimonious trace

• Characteristics: maximum # domains, unique domains
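
The two characteristics could be computed from a per-organism listing of proteins and their domains along the following lines. This Python sketch assumes a nested dictionary layout and counts repeated domains within a protein separately; both choices, and the toy data, are assumptions for illustration rather than details taken from DPIP.

```python
from typing import Dict, List, Tuple


def domain_characteristics(
    listing: Dict[str, Dict[str, List[str]]]
) -> Dict[str, Tuple[int, int]]:
    """Map organism -> (max domains in any one protein, number of unique domains)."""
    summary = {}
    for organism, proteins in listing.items():
        # Largest domain count found in a single protein of this organism.
        max_per_protein = max((len(d) for d in proteins.values()), default=0)
        # Distinct domain names seen across all of this organism's proteins.
        unique = {name for domains in proteins.values() for name in domains}
        summary[organism] = (max_per_protein, len(unique))
    return summary


# Toy listing, invented for illustration.
toy_listing = {
    "A. thaliana": {
        "p1": ["motor", "coiled_coil"],
        "p2": ["motor", "FHA", "PH"],
    },
    "S. pombe": {"p3": ["motor"]},
}

characteristics = domain_characteristics(toy_listing)
print(characteristics)
```

Each organism's pair of values is the kind of character state that would then be traced onto the preset tree.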

Maximum # of Domains per Protein

Green = 1, Black = 3

Number of Unique Domains per Organism

Blue = 1, Pink = 2, Dk. Blue = 3, Yellow = 5, Black = 6, Dash - ???

Phylogenetic Conclusions

• Inconclusive or null hypothesis supported

• Possible explanations:

– Kinesins may have limited domain complexity due to function or folding

– Inherent bias in DPIP (RefSeq database)

• Future Work:

– Testing other domains through same process

– Updating database

– Include measure for position (N/I/C)

Qualitative Approach

• Create a protein domain “tool”

– Position

– Partner domain

– General organization

– Protein evolution

– Using a variety of sequenced genomes

• Compile data into a more informative table
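
One way such a table row might be derived is sketched below in Python: for a domain of interest, report a coarse position class (N-terminal, internal, C-terminal) and its partner domains. The midpoint rule, the 25% edge fraction, the field names, and the example coordinates are all assumptions for this sketch, not the presenters' definitions.

```python
from typing import Dict, List, NamedTuple


class Domain(NamedTuple):
    name: str
    start: int  # 1-based residue position, inclusive
    end: int


def position_class(domain: Domain, protein_length: int, edge_fraction: float = 0.25) -> str:
    """Classify a domain by its midpoint: 'N', 'I' (internal), or 'C'."""
    midpoint = (domain.start + domain.end) / 2
    if midpoint <= protein_length * edge_fraction:
        return "N"
    if midpoint >= protein_length * (1 - edge_fraction):
        return "C"
    return "I"


def describe(domains: List[Domain], protein_length: int, domain_of_interest: str) -> List[Dict]:
    """One row per occurrence of the domain of interest, with its partner domains."""
    rows = []
    for index, domain in enumerate(domains):
        if domain.name == domain_of_interest:
            partners = [other.name for i, other in enumerate(domains) if i != index]
            rows.append({
                "domain": domain.name,
                "position": position_class(domain, protein_length),
                "partners": partners,
            })
    return rows


# Invented coordinates for a hypothetical 1000-residue protein.
example = [Domain("kinesin_motor", 10, 340), Domain("FHA", 450, 520), Domain("PH", 900, 990)]
rows = describe(example, 1000, "kinesin_motor")
print(rows)
```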

- Can I trace domain or protein evolution??

Presence of FHA/PH domain in kinesins

Yellow – Absent, Blue – Present
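
A presence/absence trace of this kind can be illustrated with a short parsimony sketch in Python. It uses the bottom-up pass of Fitch's algorithm for an unordered character on a fixed binary tree as a stand-in; it is not MacClade's implementation, and the tree shape, taxon labels, and states are invented for illustration.

```python
from typing import Dict, Set, Tuple, Union

# A tree is either a leaf name or a pair of subtrees (assumed layout).
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_changes(tree: Tree, states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Bottom-up Fitch pass: return (candidate states at this node, minimum changes)."""
    if isinstance(tree, str):  # leaf: its observed state, no changes needed
        return {states[tree]}, 0
    left_set, left_cost = fitch_changes(tree[0], states)
    right_set, right_cost = fitch_changes(tree[1], states)
    shared = left_set & right_set
    if shared:  # the two children can agree on a state
        return shared, left_cost + right_cost
    # The children disagree, so one state change is counted here.
    return left_set | right_set, left_cost + right_cost + 1


# Invented five-taxon tree and character states.
toy_tree = ((("a", "b"), "c"), ("d", "e"))
toy_states = {"a": "present", "b": "present", "c": "absent", "d": "absent", "e": "absent"}

root_states, changes = fitch_changes(toy_tree, toy_states)
print(root_states, changes)
```

In this toy case a single gain on the branch leading to taxa a and b explains the pattern, so the minimum number of changes is one.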

Conclusions

• DPIP program was created to answer two questions:

– Does organismal complexity correspond with protein complexity?

– Can we create a tool for researchers to better understand domains in protein families?

• For kinesin motor domains: No and Yes

• For other domains: ????

Thanks to Webb Miller, Richard Cyr, Claude dePamphilis, Alexander Richter, Plant Physiology, Biology, and Bioinformatics Depts.