Bioinformatics Databases: Getting Knowledge from Information Kristen Anton Director of BioInformatics Dartmouth Medical School BioInformatics @ Dartmouth Medical School
Date posted: 21-Dec-2015

Bioinformatics Databases: Getting Knowledge from Information

Kristen Anton
Director of BioInformatics
Dartmouth Medical School

BioInformatics @ Dartmouth Medical School


What is Bioinformatics?

Bioinformatics provides the backbone computational tools, databases and domain expertise that facilitate modern biomedical, biological and genomic research.


What is Bioinformatics?

The expertise is multidisciplinary, and the skills fall on a continuum from ‘pure’ science to ‘pure’ computing:

• ‘Wet-lab’ science
• Sequence analysis
• Modeling & structural work
• Algorithm development
• Clinical and Translational research
• Hardware & software infrastructure


With a field this extensive and skill sets so varied, where do we begin?


From Information Design, Nathan Shedroff


Moving from Information to Knowledge to Understanding:

Genetic testing

• BRCA1 and BRCA2 gene mutations: what is the real risk to women carriers? Estimates range from 25% to 80%

• Huntington’s Disease: mechanism defined, but what does that mean for the individual in terms of age of onset, severity of disease, or how disease will progress?


How can Bioinformatics facilitate the extraction of information?

• Development of tools that support laboratory experiments
• Design, implementation and integration of biological databases
• Development of various analytical tools, algorithms and models
• Development of systems to collect, validate, manage and integrate clinical and research data to facilitate translational research


Bioinformatics will not replace experiments, but can greatly streamline and enable the discovery process.


One of the fundamental tools of bioinformatics: Database

• A database is a body of information stored in tables, each with two dimensions (rows and columns)

• The power of the database lies in the relationships that you construct between the pieces of information (tables)

• SQL (Structured Query Language) - interactive and embedded

• Good design and application ensure data integrity

• Interoperability
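
The points on this slide can be sketched with Python's built-in sqlite3 module, which runs SQL embedded in a host program. This is only an illustration under stated assumptions: the gene and publication tables, their columns, and the example titles are hypothetical and do not come from any database named in this talk. The sketch shows a relationship between two tables being queried with a join, and a foreign-key constraint rejecting a record that would break integrity.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two related tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE gene (
            gene_id INTEGER PRIMARY KEY,
            symbol  TEXT NOT NULL UNIQUE
        );
        CREATE TABLE publication (
            pub_id  INTEGER PRIMARY KEY,
            gene_id INTEGER NOT NULL REFERENCES gene(gene_id),
            title   TEXT NOT NULL
        );
    """)
    conn.executemany("INSERT INTO gene VALUES (?, ?)",
                     [(1, "BRCA1"), (2, "BRCA2")])
    conn.executemany("INSERT INTO publication VALUES (?, ?, ?)",
                     [(10, 1, "Example paper A"),
                      (11, 1, "Example paper B"),
                      (12, 2, "Example paper C")])
    return conn


def publications_per_gene(conn):
    """Follow the gene -> publication relationship with a join."""
    rows = conn.execute("""
        SELECT g.symbol, COUNT(p.pub_id)
        FROM gene AS g
        LEFT JOIN publication AS p ON p.gene_id = g.gene_id
        GROUP BY g.symbol
        ORDER BY g.symbol
    """)
    return rows.fetchall()


def try_orphan_insert(conn):
    """Return True if the database rejects a publication that points at no gene."""
    try:
        conn.execute("INSERT INTO publication VALUES (99, 999, 'Orphan record')")
    except sqlite3.IntegrityError:
        return True
    return False


if __name__ == "__main__":
    db = build_demo_db()
    print(publications_per_gene(db))
    print("orphan rejected:", try_orphan_insert(db))
```

The same SELECT statement could be typed into an interactive SQL shell; embedding it in a program is what lets an application enforce the design rules consistently.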


Industry Challenge #1: Genome annotation

The Human Genome is sequenced. It is estimated that 2% of the human genome codes for genes.

The function of the remaining 98% (non-coding regions) is largely unknown but likely includes providing chromosomal structural integrity and regulating where, when, and in what quantity proteins are made.


What does the genome data look like?

  1 gcggagggtg cgtgcgggcc gcggcagccg aacaaaggag caggggcgcc gccgcaggga
 61 cccgccaccc acctcccggg gccgcgcagc ggcctctcgt ctactgccac catgaccgcc
121 aacggcacag ccgaggcggt gcagatccag ttcggcctca tcaactgcgg caacaagtac
181 ctgacggccg aggcgttcgg gttcaaggtg aacgcgtccg ccagcagcct gaagaagaag
241 cagatctgga cgctggagca gccccctgac gaggcgggca gcgcggccgt gtgcctgcgc
301 agccacctgg gccgctacct ggcggcggac aaggacggca acgtgacctg cgagcgcgag
361 gtgcccggtc ccgactgccg tttcctcatc gtggcgcacg acgacggtcg ctggtcgctg
421 cagtccgagg cgcaccggcg ctacttcggc ggcaccgagg accgcctgtc ctgcttcgcg
481 cagacggtgt cccccgccga gaagtggagc gtgcacatcg ccatgcaccc tcaggtcaac
541 atctacagtg tcacccgtaa gcgctacgcg cacctgagcg cgcggccggc cgacgagatc
601 gccgtggacc gcgacgtgcc ctggggcgtc gactcgctca tcaccctcgc cttccaggac
661 cagcgctaca gcgtgcagac cgccgaccac cgcttcctgc gccacgacgg gcgcctggtg
721 gcgcgccccg agccggccac tggctacacg ctggagttcc gctccggcaa ggtggccttc
781 cgcgactgcg agggccgtta cctggcgccg tcggggccca gcggcacgct caaggcgggc
841 aaggccacca aggtgggcaa ggacgagctc tttgctctgg agcagagctg cgcccaggtc
901 gtgctgcagg cggccaacga gaggaacgtg tccacgcgcc agggtatgga cctgtctgcc
961 aatcaggacg aggagaccga ccaggagacc ttccagctgg agatcgaccg cgacaccaaa
...

Multiply times eighteen million
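
Sequence text in this layout (a position number followed by blocks of ten bases) is easy to turn back into a plain string for analysis. The sketch below is a minimal illustration, not a full flat-file parser: the function names are made up for this example, and it assumes every line starts with a position number. It uses the first two lines of the sequence shown on this slide.

```python
def parse_sequence_lines(lines):
    """Join numbered sequence lines into one string, dropping the position numbers."""
    chunks = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # fields[0] is the 1-based position of the first base on the line;
        # the remaining fields are blocks of bases.
        chunks.extend(fields[1:])
    return "".join(chunks)


def gc_fraction(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    seq = seq.lower()
    return (seq.count("g") + seq.count("c")) / len(seq)


# First two lines of the sequence shown on the slide.
EXCERPT = [
    "1 gcggagggtg cgtgcgggcc gcggcagccg aacaaaggag caggggcgcc gccgcaggga",
    "61 cccgccaccc acctcccggg gccgcgcagc ggcctctcgt ctactgccac catgaccgcc",
]

sequence = parse_sequence_lines(EXCERPT)

if __name__ == "__main__":
    print("bases:", len(sequence))
    print("GC fraction:", round(gc_fraction(sequence), 3))
```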


What does the genome annotation look like today?


The value of a genome is only as good as its annotation

• Two steps: annotation & curation
• Each genome is annotated individually
• Manual curation is standard practice
• New tools, e.g. NCBI Mapviewer, ESTAnnotator, NCBI Annotation Pipeline
• Many databases available …


Nucleic Acids Research article lists 1078 public databases (up from 719 in 2005):

Nucleic Acids Research, 2008, Vol. 36, Database issue
http://nar.oxfordjournals.org/cgi/reprint/36/suppl_1/D2


Growth in Available Bioinformatics Databases


Industry Challenge #2: Too much unintegrated data

• Data sources are incompatible

• No (or few) standard naming conventions

• No common interface (varying tools for browsing, querying and visualizing data)


Public Data Resources

• “Mandatory” sequence submissions
• Cover enormously wide range of informational topics
• Broad (sequence) to very specific (proteins associated with tooth decay) issues
• No standard database format: poor interoperability, difficulty with integration
• Ongoing efforts to address annotation problem


NCBI Database Resources


http://www.ncbi.nlm.nih.gov/gquery/gquery.fcgi?term=uniprot


Major Sequence Repositories

• GenBank: All known nucleotide and protein sequences; International Nucleotide Sequence Database Collaboration

• EMBL Nucleotide Sequence Database: All known nucleotide and protein sequences; International Nucleotide Sequence Database Collaboration

• DNA Data Bank of Japan (DDBJ): All known nucleotide and protein sequences; International Nucleotide Sequence Database Collaboration

• TIGR/J. Craig Venter Institute: Non-redundant, gene-oriented clusters (and many curated microbial genome databases)

• UniGene: Non-redundant, gene-oriented clusters


Entrez Gene: a unified query environment for genes defined by sequence

• Summary/descriptive information
• Pubmed entries/bibliography
• Interactions
• NCBI Reference Sequences (Refseq)
• Related sequences
• Pathways
• Ontologies
• Additional links (e.g. UniGene reference)
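
Gene records like these can also be searched programmatically. The slide does not cover this, so the following is a hedged sketch only: it assumes NCBI's E-utilities ESearch endpoint and the [gene] and [orgn] field tags, both of which should be checked against the current NCBI documentation before use. The function name gene_search_url is invented for this example, and the code only builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Assumed base address of the NCBI E-utilities service (not given on the slide).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def gene_search_url(symbol, organism="human", retmax=5):
    """Build an ESearch URL for the gene database without sending it."""
    # The [gene] and [orgn] field tags are assumptions; verify them in the NCBI help pages.
    term = f"{symbol}[gene] AND {organism}[orgn]"
    params = {"db": "gene", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    print(gene_search_url("BRCA1"))
```

Keeping URL construction separate from the network call makes the query easy to inspect and to test offline.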


GenBank


GenBank Growth


GenBank Growth

• 1982: Database contains 606 sequences

• Feb 2008 release notes: Database contains more than 82 million sequences (82,853,685); the number of bases approximately doubles every 18 months

• 240,000 different species represented, with new species added at rate of 2900/month

• 16% of sequences are of human origin, 13% are human ESTs
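
As a rough cross-check of the growth figures above, the doubling time implied by the two sequence counts can be computed directly. Note the assumptions: the slide's 18-month figure refers to bases, while this sketch uses the number of sequence records, and the elapsed time is taken as 26 years (1982 to 2008) because the slide gives no month for the 1982 count.

```python
import math


def doubling_time_months(start_count, end_count, elapsed_months):
    """Average doubling time, assuming steady exponential growth between two counts."""
    doublings = math.log2(end_count / start_count)
    return elapsed_months / doublings


# Counts from the slide; 26 years elapsed is an assumption (1982 to Feb 2008).
months = doubling_time_months(606, 82_853_685, elapsed_months=26 * 12)

if __name__ == "__main__":
    print(f"{months:.1f} months per doubling")
```

Under these assumptions the record count doubles on a similar timescale to the figure quoted for bases.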


Potential Errors in GenBank

• Sequence errors estimated at between 0.37 and 35 (!) errors per 1000 bases

• Recombination

• Contamination

• Annotation errors - propagated misannotations
  – Transfer by similarity is problematic
  – Errors not always corrected in a timely way
  – Genes with varying unrelated functions depending on context
  – Functional annotation is often unsystematic

• Name-function disconnect
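
To make the quoted sequence error rates concrete, the short sketch below converts a per-1000-base rate into an expected number of errors for a sequence of a given length. The function names are illustrative, and the 1020-base length is only an example.

```python
def expected_errors(n_bases, errors_per_1000):
    """Expected number of erroneous bases at a given error rate per 1000 bases."""
    return n_bases * errors_per_1000 / 1000.0


def error_range(n_bases, low=0.37, high=35.0):
    """Expected errors at the low and high rates quoted on the slide."""
    return expected_errors(n_bases, low), expected_errors(n_bases, high)


if __name__ == "__main__":
    low, high = error_range(1020)  # example length only
    print(f"between {low:.2f} and {high:.1f} expected errors")
```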


Potential Errors in GenBank

• Naming conflicts
  – One gene, many acronyms
  – Many genes, shared acronym
  – Spelling errors
  – Cultural differences (US, UK)
  – Representation of non-ASCII characters


Also known as ACTR; AIB1; RAC3; SRC3; pCIP; AIB-1; CTG26; SRC-1; CAGH16; KAT13B; TNRC14; TNRC16; TRAM-1; MGC141848
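
An alias list like this one is why lookups by name need an explicit synonym index. The sketch below builds one from the aliases on the slide. Two things are assumptions and not taken from the slide: the official symbol "NCOA3" for this gene, and the second entry, which is included only to illustrate a shared acronym.

```python
from collections import defaultdict

GENE_ALIASES = {
    # Aliases copied from the slide; the official symbol "NCOA3" is an assumption.
    "NCOA3": ["ACTR", "AIB1", "RAC3", "SRC3", "pCIP", "AIB-1", "CTG26", "SRC-1",
              "CAGH16", "KAT13B", "TNRC14", "TNRC16", "TRAM-1", "MGC141848"],
    # Illustrative second entry (assumed, not from the slide) to show a shared acronym.
    "NCOA1": ["SRC-1", "SRC1"],
}


def build_alias_index(gene_aliases):
    """Map every upper-cased name (symbol or alias) to the set of symbols it may mean."""
    index = defaultdict(set)
    for symbol, aliases in gene_aliases.items():
        for name in [symbol, *aliases]:
            index[name.upper()].add(symbol)
    return index


def resolve(name, index):
    """Return all candidate symbols for a name; more than one means it is ambiguous."""
    return sorted(index.get(name.strip().upper(), ()))


if __name__ == "__main__":
    idx = build_alias_index(GENE_ALIASES)
    for query in ("aib1", "SRC-1", "unknown"):
        print(query, "->", resolve(query, idx))
```

Returning every candidate, instead of picking one, keeps the "many genes, shared acronym" case visible to the caller.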


Many Databases available:

• Comparative Genomics
• Gene Expression
• Gene Identification & structure
• Genetic Maps
• Genomic Databases
• Intermolecular Interactions
• Metabolic Pathways and Cellular Regulation
• Mutation Databases
• Pathology
• Protein Databases
• Protein Sequence Motifs
• Proteome Resources
• Retrieval Systems & Database Structure
• RNA Sequences
• Structure
• Transgenics
• Varied Biomedical Content


The principal requirements on the public data services

• Data quality - data quality has to be of the highest priority. However, because the data services in most cases lack access to supporting data, the quality of the data must remain the primary responsibility of the submitter.

• Supporting data - database users will need to examine the primary experimental data, either in the database itself, or by following cross-references back to network-accessible laboratory databases.

• Deep annotation - deep, consistent annotation comprising supporting and ancillary information should be attached to each basic data object in the database.

• Timeliness - the basic data should be available on an Internet-accessible server within days (or hours) of publication or submission.

• Integration - each data object in the database should be cross-referenced to representations of the same or related biological entities in other databases. Data services should provide capabilities for following these links from one database or data service to another.


Comparative Genomics: COG

• Clusters of Orthologous Groups of proteins (COGs) were delineated by comparing protein sequences encoded in 66 complete genomes, representing 38 major phylogenetic lineages.

• Each cluster corresponds to an ancient conserved domain.


Gene Expression


Genetic Maps


Genomic Databases


Intermolecular Interactions


Metabolic Pathways and Cellular Regulation


Mutation Databases


Pathology


Protein Databases


Protein Databases: Swiss-Prot

• Extremely well curated protein database

• Link to BLAST

• Powerful cross-references

• Est. 1986

• Maintained by the Department of Medical Biochemistry of the University of Geneva and the EMBL Data Library


Proteome Resources: Proteome BKL


RNA Sequences


Structure


Varied Biomedical Content


Extinct: Gene Identification & Structure


National Center for Biotechnology Information (NCBI): A network of linked resources

• Database access: GenBank, structure, function, SNP, taxonomy...
• Literature (PubMed)
• Whole genomes
• Tools
• Contacts & research information
• FTP


NCBI resources

• Nucleotide databases
• Protein databases
• Structure databases
• Taxonomy databases
• Genome databases
• Expression databases


A word about caBIG: Cancer Biomedical Informatics Grid

• NCI (National Cancer Institute) effort to align NCI-funded Cancer Centers

• Definition: a voluntary network or grid connecting individuals and institutions to enable the sharing of data and tools, creating a World Wide Web of cancer research

• Goal: to speed up the delivery of innovative approaches for the prevention and treatment of cancer

• Leadership: NCI Center for Bioinformatics


The real fun: Clinical and Translational Research

• Clinical Trials, Prevention studies, Molecular Epidemiology studies

• Integrating epidemiological, clinical, outcome and molecular data

• Cross-disciplinary: requires expert technical skill (process engineering, system design, software development, data management & integration) and scientific expertise (genetics, genomics, proteomics)


Chemoprevention study: Prevention of skin cancer by antioxidants

• Subjects (5000) with clinical evidence of chronic arsenic exposure will be recruited from two ongoing cohort studies

• Two study sites in Araihazar and Matlab; Coordinating center at Dhaka and University of Chicago; Data center at Dartmouth

• Goal: to test effectiveness of vitamin E and selenium in preventing development of skin cancer


Data issues

• Design processes, forms and systems

• Collect: Family history, Risk factors, Health & outcomes, bio-specimens/related data

• Screening, recruitment, pill distribution, data collection, bio-specimen tracking, data management, data integration and analysis

• Challenges: intermittent power, low bandwidth internet access, low-tech facilities (no land-line phone, no fax, no cooling, no computers at sites), translation of information into and back from Bengali – etc.


Mars, Palm trees & Cancer Biomarkers?

Dartmouth – NASA JPL collaboration
Life Sciences Data

Early Detection Research Network: Biomarker Atlas Project

• EDRN is an NCI initiative to discover and validate individual or panels of biomarkers for the early detection of cancers

• Organ-based research groups, international team

• Informatics center at JPL – novel and sophisticated infrastructure

• First step: EDRN Resource Network Exchange (ERNE) – Dartmouth collaboration - 2002


EDRN Knowledge Environment (diagram)

Components: eCAS Science Warehouse; CDE Repository; ERNE; VSIMS; Participant DB; Protocol DB; Biomarker_DB; Distributed Specimen Databases; Public Portal; Bioinformatics Tools

Content labels: EDRN science data results (local, distributed and varying degrees of validation); descriptions of biomarkers and their use; descriptions of EDRN studies (participants, specimen tracking, etc.); protocols and their descriptions; data elements and their descriptions; participants and their characteristics

Linking keys shown on the connections: (protocol_id, participant_id) and (protocol_id)
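
The diagram labels its connections with shared keys, which suggests that records held in separate databases are linked by matching (protocol_id, participant_id). The sketch below illustrates that idea only; it is not the EDRN implementation. The record types, field names other than the two keys, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    protocol_id: int
    participant_id: int
    site: str  # hypothetical attribute


@dataclass(frozen=True)
class Result:
    protocol_id: int
    participant_id: int
    biomarker: str  # hypothetical attribute
    value: float    # hypothetical attribute


def link_results(participants, results):
    """Pair each result with its participant using the shared composite key."""
    by_key = {(p.protocol_id, p.participant_id): p for p in participants}
    linked, unmatched = [], []
    for result in results:
        participant = by_key.get((result.protocol_id, result.participant_id))
        if participant is None:
            unmatched.append(result)  # keep orphans visible instead of dropping them
        else:
            linked.append((participant, result))
    return linked, unmatched


if __name__ == "__main__":
    people = [Participant(1, 100, "Site A"), Participant(1, 101, "Site B")]
    measurements = [Result(1, 100, "marker-x", 2.5), Result(2, 100, "marker-x", 1.0)]
    linked, unmatched = link_results(people, measurements)
    print(len(linked), "linked;", len(unmatched), "unmatched")
```

Using the pair of IDs as the key matters because a participant number may be reused across protocols.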


EDRN Biomarker Atlas Project: Lung Prototype


EDRN Biomarker Atlas Project: Colon Models


Barro Colorado – Smithsonian Tropical Research Institute
