Bringing reason to phenotype diversity, character change, and common descent

Bringing reason to phenotype diversity, character change, and common descent

Hilmar Lapp, National Evolutionary Synthesis Center (NESCent)

NCBO Webinar, Nov 17, 2010


Talk I gave in the National Center for Biomedical Ontology (NCBO) Webinar series on Nov 17, 2010. The abstract, bio, and video recording are on the NCBO website: http://www.bioontology.org/phenoscape

Transcript of Bringing reason to phenotype diversity, character change, and common descent

Page 2: Bringing reason to phenotype diversity, character change, and common descent

Life has evolved a stunning

diversity of phenotypes

Images: Web Tree of Life (http://tolweb.org)

Parfrey et al (2010); Parfrey & Katz; Regier et al (2010)

Page 3: Bringing reason to phenotype diversity, character change, and common descent

Large body of evolutionary phenotype documentation

Page 4: Bringing reason to phenotype diversity, character change, and common descent

Sereno (1999)

Mabee (2000)

from: Understanding Evolution

Chen & Mayden (2010)

Phenotype changes inform

phylogenetic reconstruction

Page 5: Bringing reason to phenotype diversity, character change, and common descent

As complex free text, phenotypes are resistant to computing

(Lundberg and Akama 2005)

Page 6: Bringing reason to phenotype diversity, character change, and common descent

Finding similar information in free-text is difficult

“lacrymal bone...flat” Mayden 1989

“lacrimal...small, flat” Grande and Poyato-Ariza 1999

“lacrimal...triangular” Royero 1999

“first infraorbital (lachrimal) shape...flattened” Kailola 2004

“fourth infraorbital...anterior and posterior margins...in parallel” Zanata and Vari 2005
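
The first four descriptions above name the same bone under different spellings, so a plain string search for one spelling misses the others. A minimal sketch of the idea, assuming a hypothetical synonym table that maps each spelling to one canonical label (the table and the label "lacrimal" are illustrative and not taken from TAO):

```python
# Hypothetical synonym table: spelling variants mapped to one canonical label.
SYNONYMS = {
    "lacrymal bone": "lacrimal",
    "lacrimal": "lacrimal",
    "lachrimal": "lacrimal",
    "first infraorbital": "lacrimal",
}


def canonical_entity(text):
    """Return the canonical label of the first known synonym found in text."""
    lowered = text.lower()
    # Try longer synonyms first so multi-word names are preferred.
    for synonym in sorted(SYNONYMS, key=len, reverse=True):
        if synonym in lowered:
            return SYNONYMS[synonym]
    return None


descriptions = [
    "lacrymal bone...flat",
    "lacrimal...small, flat",
    "lacrimal...triangular",
    "first infraorbital (lachrimal) shape...flattened",
]
labels = [canonical_entity(d) for d in descriptions]
```

A literal search for "lacrimal" would miss the first description, while the mapped labels agree for all four.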

Page 7: Bringing reason to phenotype diversity, character change, and common descent

Computing example: Search by Similarity

Fig. 3, Washington et al (2009)

Fig. 1, Washington et al (2009)

Page 8: Bringing reason to phenotype diversity, character change, and common descent

Computing example: Search by Similarity

Fig. 3, Washington et al (2009)

Fig. 1, Washington et al (2009)

Trogloglanis pattersoni - a blind catfish

http://tolweb.org/Trogloglanis/69910

Page 9: Bringing reason to phenotype diversity, character change, and common descent

Integrating across studies?

Fig. 7, Sereno (2009)

Fig. 6, Sereno (2009)

Page 10: Bringing reason to phenotype diversity, character change, and common descent

Computing over comparative morphology?

Cyprinus carpio Pangio anguillaris Nemacheilus fasciatus

Catostomus commersoni Gyrinocheilus aymonieri Phenacogrammus interruptus

Page 11: Bringing reason to phenotype diversity, character change, and common descent

Knowledge mining & hypothesis generation

Mutagenesis

Model Organism

Laue et al (2008)

Mutant or missing protein at specific developmental stage

Phenotype change(s) to wildtype

Order Siluriformes

Pimelodus maculatus

spinelet middle nuchal plate

anterior

nuchal plate

Order Characiformes

Catoprion mento

abdominal

scutes

predorsal

spine

2 cm

Non-model organisms

Mutation, selection, drift, gene flow

Altered expression or function of protein

Phenotype changes between evolutionary lineages

Page 12: Bringing reason to phenotype diversity, character change, and common descent

Phenoscape

• Collaboration between P. Mabee (PI, U. South Dakota), M. Westerfield (ZFIN), and Todd Vision (UNC, NESCent)

• Aim: Foster devo-evo synthesis by

• Prototyping a database of curated, machine-interpretable evolutionary phenotypes.

• Integrating these with mutant phenotypes from model organisms.

• Enabling data-mining and discovery for candidate genes of evolutionary phenotype transitions.

• Informatics for the project is developed and hosted at NESCent

Page 13: Bringing reason to phenotype diversity, character change, and common descent

Entity-Quality Model for Evolutionary Phenotypes

Character: supraorbital bone shape

State: bent

Entity (TAO): supraorbital bone

Quality (PATO): bent

Page 15: Bringing reason to phenotype diversity, character change, and common descent

Entity-Quality Model for Evolutionary Phenotypes

Character: supraorbital bone shape

State: bent

Entity (TAO): supraorbital bone

Quality (PATO): bent

Phenotype: Entity and Quality taken together
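
The Entity-Quality split above can be sketched as a small data structure. This is a minimal illustration, not the Phenoscape implementation; the class and function names are hypothetical, and plain term labels stand in for TAO and PATO identifiers:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phenotype:
    """An Entity-Quality (EQ) phenotype using plain term labels."""
    entity: str   # anatomy term label (TAO on the slide)
    quality: str  # quality term label (PATO on the slide)

    def __str__(self):
        return f"{self.quality} inheres_in {self.entity}"


def from_character_state(entity, state_quality):
    """Build an EQ phenotype from a character's entity and one of its states."""
    return Phenotype(entity=entity, quality=state_quality)


# Character "supraorbital bone shape", state "bent".
p = from_character_state("supraorbital bone", "bent")
print(p)
```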

Page 16: Bringing reason to phenotype diversity, character change, and common descent

Phenotype Assertion

Brycinus brevis exhibits some (bent inheres_in supraorbital bone)

Brycinus brevis: Taxon ontology term

supraorbital bone: Anatomy ontology term

bent: Phenotypic Quality ontology term

exhibits: Links a taxon to a phenotype

inheres_in: Links a quality to the entity that is its bearer

Page 17: Bringing reason to phenotype diversity, character change, and common descent

Phenotype Assertion

Brycinus brevis exhibits some (bent inheres_in supraorbital bone)

Brycinus brevis: Taxon ontology term

supraorbital bone: Anatomy ontology term

bent: Phenotypic Quality ontology term

exhibits: Links a taxon to a phenotype

inheres_in: Links a quality to the entity that is its bearer

Attached to the assertion: Evidence Code, Specimen, Publication
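
A phenotype assertion with its provenance slots could be represented roughly as follows. This is a sketch under assumptions: the class name and field names are hypothetical, and the provenance fields are left empty because the slide gives no concrete values for them:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PhenotypeAssertion:
    """Links a taxon to an EQ phenotype, with optional provenance."""
    taxon: str
    quality: str
    entity: str
    evidence_code: Optional[str] = None  # evidence code, if recorded
    specimen: Optional[str] = None       # specimen reference, if recorded
    publication: Optional[str] = None    # source publication, if recorded

    def statement(self):
        """Render the assertion in the notation used on the slide."""
        return (f"{self.taxon} exhibits some "
                f"({self.quality} inheres_in {self.entity})")


a = PhenotypeAssertion("Brycinus brevis", "bent", "supraorbital bone")
print(a.statement())
```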

Page 18: Bringing reason to phenotype diversity, character change, and common descent

Ontology development

Cover art: K. Luckenbill

Dahdul et al (2009)

Ontologies for

• Teleost Anatomy

• Teleost Taxonomy

• Phenotypic Quality (PATO)

Page 19: Bringing reason to phenotype diversity, character change, and common descent

Curation

Dahdul et al., 2010 PLoS ONE

1. Students: gather publications (scan hard copies, produce OCR PDFs)

2. Students: Manual entry of free text character descriptions, matrix, taxon list, specimens and museum numbers using Phenex

3. Character annotation by experts: Entry of phenotypes and homology assertions using Phenex

4. Consistency checks, upload of data to public view of Phenoscape KB

Curators: Wasila Dahdul, Miles Coburn, Jeff Engemen, Terry Grande, Eric Hilton, John Lundberg, Paula Mabee, Richard Mayden, Mark Sabaj Pérez

~ 5 person years

Page 20: Bringing reason to phenotype diversity, character change, and common descent
Page 21: Bringing reason to phenotype diversity, character change, and common descent

• Curated 4,208 characters in 2,310 species from 51 papers

• 333,987 evolutionary phenotype assertions

• 11,267 phenotype statements about 2,953 genes

Page 22: Bringing reason to phenotype diversity, character change, and common descent

Phenoscape Knowledgebase

Page 23: Bringing reason to phenotype diversity, character change, and common descent
Page 24: Bringing reason to phenotype diversity, character change, and common descent
Page 25: Bringing reason to phenotype diversity, character change, and common descent
Page 26: Bringing reason to phenotype diversity, character change, and common descent
Page 27: Bringing reason to phenotype diversity, character change, and common descent

Full workflow: free-text → EQ → integrated KB

legacy free-text character data: Kailola (2004)

EQ = body lacks all parts of type scale

mutant phenotype: Harris et al. (2008)

Here, we describe the phenotypic and molecular characterization of a set of mutants showing loss of adult structures of the dermal skeleton, such as the rays of the fins and the scales, as well as the pharyngeal teeth. The mutations represent adult-viable, loss of function alleles in the ectodysplasin (eda) and ectodysplasin receptor (edar) genes.

EQ = body has fewer parts of type scale

[Diagram: integrated graph over Taxon, Gene, Anatomical Entity, and Quality terms.]

Taxon nodes: Teleostei, Siluriformes, Gasterosteiformes, Ictalurus punctatus, Apeltes quadracus

Gene nodes: eda, edadt3S243X/+

Anatomical Entity nodes: anatomical structure, body, scale

Quality nodes: has number of, has fewer parts of type, lacks all parts of type

Phenotype nodes: body lacks all parts of type scale, body has fewer parts of type scale

Edge labels: is_a, exhibits, variant_of, influences, inheres_in, towards

Photo © Jean Ricardo Simões Vitule
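
The way the integrated graph links a taxon to a candidate gene can be sketched as a lookup over a few edges. This is a toy illustration, not the Phenoscape KB query. It assumes that the lacks-all-parts phenotype is_a the has-fewer-parts phenotype, that Ictalurus punctatus exhibits the former, and that the eda genotype influences the latter; all dictionary and function names are hypothetical:

```python
# ASSUMPTION: edges below are a reading of the slide's diagram, not KB data.
PHENOTYPE_IS_A = {
    "body lacks all parts of type scale": "body has fewer parts of type scale",
}
TAXON_EXHIBITS = {
    "Ictalurus punctatus": {"body lacks all parts of type scale"},
}
GENOTYPE_INFLUENCES = {
    "edadt3S243X/+": {"body has fewer parts of type scale"},
}
VARIANT_OF = {"edadt3S243X/+": "eda"}


def with_ancestors(phenotype):
    """Yield a phenotype followed by all of its is_a ancestors."""
    while phenotype is not None:
        yield phenotype
        phenotype = PHENOTYPE_IS_A.get(phenotype)


def candidate_genes(taxon):
    """Genes whose genotypes influence a phenotype the taxon exhibits,
    either directly or through an is_a ancestor of that phenotype."""
    observed = set()
    for phenotype in TAXON_EXHIBITS.get(taxon, ()):
        observed.update(with_ancestors(phenotype))
    return {
        VARIANT_OF[genotype]
        for genotype, phenotypes in GENOTYPE_INFLUENCES.items()
        if observed & phenotypes
    }


genes = candidate_genes("Ictalurus punctatus")
```

The match only succeeds because the taxon's phenotype is generalized through is_a before it is compared with the mutant phenotype.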

Page 28: Bringing reason to phenotype diversity, character change, and common descent

System architecture

[Architecture diagram. Component labels:]

• OBO Library: Phenotypic Quality Ontology (PATO), Teleost Taxonomy Ontology (TTO), Anatomy Ontologies (ZFA, TAO)

• Skeletal Character Data (from phylogenetic treatments in literature)

• Phenex (Evolutionary EQ annotation)

• NeXML, Evolutionary EQ Phenotypes (through annotation)

• Homology assertions

• Mutant EQ phenotypes from Zebrafish Model Organism Database

• Genes & genotypes

• Knowledgebase (OBD) (PostgreSQL)

• OBD Reasoner

• OBD Programming API (Java)

• Knowledgebase Data Services API (REST)

• Knowledgebase User Interface: Web Application for Exploration & Mining (Ruby on Rails, JavaScript)

• External web sites and client applications

Page 29: Bringing reason to phenotype diversity, character change, and common descent

KB is based on OBD (Ontology-Based Database)

(C. Mungall, LBL)

Page 30: Bringing reason to phenotype diversity, character change, and common descent

[Data model diagram. Node and edge labels as extracted from the slide:]

Phenotype: TAO:taxon, PHENO:exhibits, Phenotype (class expression), PATO:quality, OBO_REL:is_a, OBO_REL:inheres_in, OBO_REL:towards, TAO:entity, has_measurement, Measurement (value/max/min, unit)

Character matrix: CDAO:CharacterStateDataMatrix, CDAO:has_TU, CDAO:TU (name = Publication Taxon), CDAO:has_Character, CDAO:Character (name = character text), CDAO:has_Datum, CDAO:CharacterStateDatum, CDAO:has_State, CDAO:CharacterStateDomain (name = state text), PHENO:has_taxon, PHENO:asserted_for_otu

Publication and provenance: PHENO:has_publication, PHENO:Publication (dc:abstract, dc:bibliographicCitation, dc:date), publication notes (literal text), dc:creator, curator(s), OBO_REL:posited_by, has_evidence, ECO:evidence, PHENO:has_comment, comment (literal text)

Specimen: Specimen, dwc:collectionID, COLLECTION, dwc:catalogID, catalog number (literal text), dwc:individualID

Genes: Genotype, OBO_REL:influences, OBO_REL:variant_of, Gene, ZFIN:Publication (uid = ZFIN ID), OBO_REL:posited_by

Page 31: Bringing reason to phenotype diversity, character change, and common descent

Phenoscape OBD reasoner

• All OBD built-in rules:

  • is_transitive(R), X R Y, Y R Z → X R Z
  • is_reflexive(R) → X R X
  • X is_a Y, Y R some Z → X R some Z
  • Y is_a Z, X R some Y → X R some Z
  • X R Y, Y S Z, transitive_over(R,S) → X R Z
  • X R1 Y, Y R2 Z, holds_over_chain(R,R1,R2) → X R Z

• Additional Phenoscape rule:

  • T exhibits P, T is_a T’ → T’ exhibits P
  • Consistent with OWL due to instance quantification
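
A few of these rules can be illustrated with a naive forward-chaining loop. This is a sketch, not the OBD reasoner: it covers only is_a transitivity, generalization of an exhibited phenotype to its is_a parents, and the Phenoscape rule that propagates exhibits up the taxonomy. The function name and the example facts are assumptions made for illustration:

```python
def saturate(is_a, exhibits):
    """Apply three rules until no new facts appear.

    is_a:     set of (child, parent) pairs over taxa and phenotypes
    exhibits: set of (taxon, phenotype) pairs
    """
    is_a, exhibits = set(is_a), set(exhibits)
    while True:
        # is_a is transitive: X is_a Y, Y is_a Z -> X is_a Z
        new_is_a = {(x, z) for (x, y) in is_a for (y2, z) in is_a if y == y2}
        new_exhibits = set()
        for taxon, phenotype in exhibits:
            for child, parent in is_a:
                if child == taxon:
                    # Phenoscape rule: T exhibits P, T is_a T' -> T' exhibits P
                    new_exhibits.add((parent, phenotype))
                if child == phenotype:
                    # Y is_a Z, X R some Y -> X R some Z (with R = exhibits)
                    new_exhibits.add((taxon, parent))
        if new_is_a <= is_a and new_exhibits <= exhibits:
            return is_a, exhibits
        is_a |= new_is_a
        exhibits |= new_exhibits


# Illustrative facts (assumed for this example).
asserted_is_a = {
    ("Ictalurus punctatus", "Siluriformes"),
    ("Siluriformes", "Teleostei"),
    ("body lacks all parts of type scale", "body has fewer parts of type scale"),
}
asserted_exhibits = {("Ictalurus punctatus", "body lacks all parts of type scale")}

inferred_is_a, inferred_exhibits = saturate(asserted_is_a, asserted_exhibits)
```

The loop recomputes everything on each pass, which is fine for a handful of facts but would not scale to a knowledgebase.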

Page 32: Bringing reason to phenotype diversity, character change, and common descent

Major taxonomic groups have similar distribution of entities

among phenotypes

Page 33: Bringing reason to phenotype diversity, character change, and common descent

Substantial overlap between model organism and evolutionary phenotypes

[Bar chart. Axis: 0 to 2500. Categories: nervous system, sensory system, skeletal system, cardiovascular system, digestive system, immune system, endocrine system, renal system, respiratory system, liver and biliary system, musculature system, reproductive system, hematopoietic system. Series: Evolutionary characters, Zebrafish phenotypes.]

• 4,217 zebrafish phenotypes

• 3,405 evolutionary characters

Page 34: Bringing reason to phenotype diversity, character change, and common descent

Hypothesis generation: Genetic basis for scale loss in Siluriformes

Harris et al., 2008

Mutation of eda gene in Danio:

Copyright © Jean Ricardo Simões Vitule, All Rights Reserved

Ictalurus punctatus:

Page 35: Bringing reason to phenotype diversity, character change, and common descent

Hypothesis generation: Genetic basis for absence of the basihyal bone in Siluriformes

Mutation of brpf1 gene in Danio:

Ictalurus punctatus:

Laue et al (2008)

Page 36: Bringing reason to phenotype diversity, character change, and common descent

Making PATO usable for evolutionary data

Attribute              Example Qualities
Color                  black, colorless
Composition            cartilaginous, ossified
Count                  present, absent
Position               horizontal, vertical
Quality                open, closed, flexible
Relational Shape       protruding into
Relational Spatial     anterior to, lateral to
Relational Structural  fused with, overlap with, separated from
Shape                  concave, interdigitated, lobed, triangular
Size                   increased length, decreased length
Texture                wrinkled, smooth

Page 37: Bringing reason to phenotype diversity, character change, and common descent

Getting PATO right is a challenge

• PATO is “single-inheritance” - what is the right axis of classification?

  • relational shape vs monadic shape
  • relational spatial vs position
  • shape and size vs natural language: “Interopercle shape: expanded posteroventrally”

• Different ways to observe or generate a phenotypic quality

  • Color as color hue (radiation quality) or pigmentation (structural quality)

• Relative sizes don’t have a universal reference

• Negation (“not round”, “unelongated”): means complement under attribute ‘(shape and not(round))’

Page 38: Bringing reason to phenotype diversity, character change, and common descent

Mapping EQs back to characters is a challenge

• Properties of “good” phylogenetic characters:

• Exclusivity of states

• Distinguishability of states

• Independence of characters

• Finding exclusive states requires incompatible phenotypes. How to determine incompatibility?

• Two phenotypes are incompatible iff they cannot both inhere in the same specimen.

• Two qualities are incompatible iff an entity cannot bear both.

Page 39: Bringing reason to phenotype diversity, character change, and common descent

Which EQs and qualities are incompatible?

• Incompatible Qs

  • present vs. absent
  • triangular vs. round
  • absent vs. any other quality

• Incompatible EQs

  • (Q inheres_in bone E) vs (cartilage E absent)

• Compatible Qs

  • present vs. any other quality (except absent)
  • serrated vs. round
  • some colors
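
The quality-level cases listed above can be encoded as a small check. This is only a sketch: the table of mutually exclusive shape groups is a hypothetical stand-in for whatever PATO or curators would supply, and the function does not cover EQ-level incompatibility:

```python
# Hypothetical table of quality groups whose members exclude one another.
EXCLUSIVE_GROUPS = [{"triangular", "round"}]


def incompatible(q1, q2):
    """Return True if one entity cannot bear both qualities."""
    if q1 == q2:
        return False
    if "absent" in (q1, q2):
        # absent vs. any other quality (including present)
        return True
    if "present" in (q1, q2):
        # present vs. any other quality (except absent)
        return False
    # Otherwise only qualities in the same exclusive group clash.
    return any(q1 in group and q2 in group for group in EXCLUSIVE_GROUPS)


print(incompatible("triangular", "round"))
print(incompatible("serrated", "round"))
```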

Page 40: Bringing reason to phenotype diversity, character change, and common descent

Detecting phenotype change and variation

Hemiodus argenteus

Hemiodus unimaculatus

{shape:bent inheres_in supraorbital bone, count:absent inheres_in upper pharyngeal 5 tooth}

{shape:straight inheres_in supraorbital bone, count:present inheres_in upper pharyngeal 5 tooth}

Hemiodus: {shape:bent inheres_in supraorbital bone, shape:straight inheres_in supraorbital bone, count:absent inheres_in upper pharyngeal 5 tooth, count:present inheres_in upper pharyngeal 5 tooth}

{Change in: shape inheres_in supraorbital bone, Change in: count inheres_in upper pharyngeal 5 tooth}
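
The step from species-level phenotype sets to "Change in" statements can be sketched as follows. This is a minimal illustration with hypothetical function names. Which set belongs to which species is an assumption here, since the slide text does not pair them explicitly, but the result does not depend on the pairing:

```python
from collections import defaultdict


def parse(eq):
    """Split 'shape:bent inheres_in supraorbital bone' into its three parts."""
    left, entity = eq.split(" inheres_in ", 1)
    attribute, quality = left.split(":", 1)
    return attribute, quality, entity


def changes(species_phenotypes):
    """Return (attribute, entity) pairs with more than one quality in the clade."""
    values = defaultdict(set)
    for phenotypes in species_phenotypes.values():
        for eq in phenotypes:
            attribute, quality, entity = parse(eq)
            values[(attribute, entity)].add(quality)
    return {key for key, qualities in values.items() if len(qualities) > 1}


# ASSUMPTION: pairing of sets with species is illustrative.
hemiodus = {
    "Hemiodus argenteus": {
        "shape:bent inheres_in supraorbital bone",
        "count:absent inheres_in upper pharyngeal 5 tooth",
    },
    "Hemiodus unimaculatus": {
        "shape:straight inheres_in supraorbital bone",
        "count:present inheres_in upper pharyngeal 5 tooth",
    },
}
changed = changes(hemiodus)
```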

Page 41: Bringing reason to phenotype diversity, character change, and common descent

Visualizing phenotype profiles on a tree

[Screenshot of the phenotypic profile tree view. Recoverable interface text:]

Phenotypic profile: Query taxa with these phenotypes.

• opercle triangular (including parts)

• adipose fin absent (including parts)

• dorsal fin absent (including parts)

Include inferred phenotypes

Phenotype match: Taxon color indicates the greatest level of match of specified phenotype(s) found within a species in the clade. Legend: 100%, 75%, 50%, <50%

Page 42: Bringing reason to phenotype diversity, character change, and common descent

Navigating phenotype variation on a tree

[Screenshot of the taxonomic distribution view. Recoverable interface text:]

Taxonomic distribution of entity term: basihyal bone

Phenotype   Taxa  Phenotype Profiles
shape       8     3
triangular  1     1
Y-shaped    1     1

Options: Show taxa with unspecified shape phenotypes; Show taxa without phenotype data; Limit tree to Cypriniformes; or Sets of taxa with matching Phenotype Profiles

Tree: Ostariophysi; Cypriniformes; Cyprinidae, Balitoridae, Psilorhynchidae, Catostomidae, Botiidae, Vaillantellidae, Cobitidae, Gyrinocheilidae

Page 43: Bringing reason to phenotype diversity, character change, and common descent

Reasoning over

homology

image by Kyle Luckenbill, ANSP

Entity 1       Taxon 1   Relationship   Entity 2                          Taxon 2    Evidence       Reference(s)
scaphium       Otophysi  homologous_to  neural arch 1                     Teleostei  IDS, IMS, IPS  (Fink and Fink, 1981; Rosen and Greenwood, 1970)
intercalarium  Otophysi  homologous_to  neural arch 2 (ventral portion)   Teleostei  IDS, IMS, IPS  (Rosen and Greenwood, 1970)
intercalarium  Otophysi  homologous_to  neural arch 2                     Teleostei  NAS            (Fink and Fink, 1981)
intercalarium  Otophysi  homologous_to  neural arch 2                     Teleostei  IMS            (Hora, 1922)
intercalarium  Otophysi  homologous_to  rib of vertebra 2                 Teleostei  TAS            (Hora, 1922)
tripus         Otophysi  homologous_to  parapophysis + rib of vertebra 3  Teleostei  IDS, IMS, IPS  (Fink and Fink, 1981; Rosen and Greenwood, 1970)

Page 44: Bringing reason to phenotype diversity, character change, and common descent

Formalizing homology relationships

• Formal pattern is ternary: E1 in_taxon T1 homologous_to E2 in_taxon T2 as E3 in_taxon T3

• Classifying homology relationships

• 1-1 homology (phylogenetic homology)

• serial homology

• A iso_homologous_to B as C ⇒ all A derived_by_descent_from some (C and has_derived_by_descendent some B) and all B derived_by_descent_from some (C and has_derived_by_descendent some A)

• shares_ancestor_with as a relation chain: derived_by_descent_from o has_derived_by_descendent
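
The relation chain in the last bullet can be illustrated by composing two sets of pairs. This sketch works on individual structures rather than OWL classes, treats has_derived_by_descendent as the inverse of derived_by_descent_from (an assumption), and uses a hypothetical ancestral structure name:

```python
def compose(r, s):
    """Relation composition: (x, z) whenever x r y and y s z."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def inverse(r):
    """Swap the direction of every pair in the relation."""
    return {(y, x) for (x, y) in r}


# Hypothetical example: both structures descend from one ancestral structure.
derived_by_descent_from = {
    ("scaphium", "ancestral neural arch 1"),
    ("neural arch 1", "ancestral neural arch 1"),
}
# ASSUMPTION: has_derived_by_descendent is the inverse relation.
has_derived_by_descendent = inverse(derived_by_descent_from)

shares_ancestor_with = compose(derived_by_descent_from, has_derived_by_descendent)
```

Because the chain goes up to the ancestor and back down, the resulting relation is symmetric and also relates each structure to itself.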

Page 45: Bringing reason to phenotype diversity, character change, and common descent

Option 1: Asserting homology at higher-level taxa

Page 46: Bringing reason to phenotype diversity, character change, and common descent

Option 2: Asserting homology at species level

Page 47: Bringing reason to phenotype diversity, character change, and common descent

Validation through standard OWL-DL reasoning

Page 48: Bringing reason to phenotype diversity, character change, and common descent
Page 49: Bringing reason to phenotype diversity, character change, and common descent

Opening descriptive biological data to computing can enable new science

[Diagram linking descriptive biology to other fields.]

Descriptive biology: Phenotypes, Traits, Function, Behavior, Habitat, Life Cycle, Reproduction, Conservation Threats

Linked fields: Biodiversity (Specimens, Occurrence records); Taxonomy, Species ID; Conservation Biology; Ecology; Genetics; Genomics, Gene expression; Genetic variation; Physiology

Page 50: Bringing reason to phenotype diversity, character change, and common descent

Acknowledgements

• Phenoscape Personnel & PIs: P. Mabee, M. Westerfield, T. Vision, J. Balhoff, C. Kothari, W. Dahdul, P. Midford

• Phenoscape curators & workshop participants

• Berkeley Bioinformatics & Ontologies Project (BBOP): C. Mungall, S. Lewis

• National Evolutionary Synthesis Center (NESCent)

• NSF (DBI 0641025)