Bioinformatics BIO520/INF520 Jim Lund Assigned reading: Ch1 & 2

Post on 11-Jan-2016


Bioinformatics

BIO520/INF520

Jim Lund

Assigned reading:

Ch1 & 2

Bioinformatics applies principles of information science (derived from applied math, computer science, and statistics) to make the vast, diverse, and complex life sciences data more understandable and useful. It automates simple but repetitive types of analysis.

Computational biology uses mathematical and computational approaches to address theoretical and experimental questions in biology.

Bioinformatics

BIO520 Topics

• Navigating biological databases.
• Sequence alignment.
• Proteins
– 3D structure visualization, prediction, motif analysis.
• DNA sequence annotation.
– Gene finding in prokaryotes and eukaryotes.
• RNA structure.
• Phylogenetic inference.
• Genome/transcriptome/proteome
– Function & analyses.

Molecular information-DNA

• Raw bacterial DNA sequence
– Coding or not?
– Parse into genes?
– Find regulatory sequences?
– PCR primers, vector engineering?
– 4 bases: ACGT
• 1 kb for a gene
• Mb for a genome
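The "Coding or not?" and "Parse into genes?" questions above can be illustrated with a toy open reading frame (ORF) scan. This is a minimal sketch, not a real gene finder: it assumes the forward strand only, ATG as the only start codon, and the three standard stop codons. The function name `find_orfs` and the `min_codons` parameter are made up for illustration.

```python
# Toy ORF scan: forward strand only, ATG start, standard stop codons.
# find_orfs and min_codons are illustrative names, not from any library.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end) index pairs for ATG...stop stretches; end is exclusive."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):  # three forward reading frames
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                # count the codons before the stop codon
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


example = "CCATGAAATTTTAGGG"
print(find_orfs(example))  # prints [(2, 14)]
```

Real gene finders for prokaryotes and eukaryotes also consider the reverse strand, alternative start codons, codon usage, and (in eukaryotes) introns, none of which this sketch attempts.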

http://www.ncbi.nlm.nih.gov/Genbank/genbankstats.html

Growth of GenBank (1982-2009)

[Figure: GenBank growth by year, 1982-2009. Axes: sequences (millions) and base pairs (billions).]

Protein Structure Prediction

Proteomics

1978-1998

MALDI-TOF? ESI-MS?

Metabolic Networks

KEGG, 1998

Regulatory Networks

KEGG

Bioinformatics: what is it?

Acquisition, curation, and analysis of biological data

Hypothesis

Bioinformatic Data, 1978 to 2008

• DNA sequence
• Gene expression
• Protein expression
• Protein structure
• Genome mapping
• Metabolic networks
• Regulatory networks
• Trait mapping
• Gene function analysis
• Scientific literature

Goals of the HGP, 1998-2003

• Reference Human Genome Sequence
– Draft 2001, finished in 2003
• Improved Sequence Technology
– $0.25 per finished base
• Human Genome Sequence Variation
• Technology for Functional Genomics
• Comparative Genomics
– Finish Mouse by 2005 (well ahead here)
• ELSI (ethical, legal, and social implications)

Genome sequences highlight the finiteness of the set of sequences!

What remains to be done?

• Comparative genomics
• Description of mRNAs, proteins (identity and structure)
• Functional analysis
• Detailed understanding of development, regulation, variation

The Gene for…

Other Reasons to Care

Genentech

Affymetrix

Biologist User Training

• Internet sites
– Range from high quality to unreliable.
• Unread documentation
• Popular program sites with NO documentation
– “Perhaps one day I will get around to writing some documentation”
– Help from a WWW service, hit several hundred times per day!

Dramatic Changes in Information Science

• Information Storage
– Digital: text, numbers, images
• Computerized Data Analysis
• Automated Data Analysis
• Information Distribution
– Internet, cloud, etc.

Moore’s Law

Intel Corporation

Computer Science and bioinformatics

• Operating Systems

• Programming

• Algorithms

– New problems keep turning up!

• Data structure/databases

• Interfaces

• Search and visualization

BIO520 Nuts and Bolts

• Syllabus & Schedule

• Textbook
– Internet
– Program documentation
• Labs on Fridays in Young B-35
• Exams (2 + final)
• Grading:
– 12 labs: 10 pts
– Exams: 50 pts
– Final: 50 pts

http://elegans.uky.edu/520

Textbooks

Required textbook:
• Understanding Bioinformatics by Marketa Zvelebil and Jeremy Baum

Supplemental reading (don’t buy):
• Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, 3rd Ed.
– Baxevanis and Ouellette

Biology background material:
– Genes IX (Lewin)
– Cell Biology (Watson et al., Darnell et al.)
– NCBI Bookshelf (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Books&itool=toolbar)

Computer Resources

• http://elegans.uky.edu/520
• Locally installed programs:
– Cn3D, Clustal, TreeView, Chime
• Web based tools:
– Databases
– Software programs

Biological Principles

Evolution by natural selection

DNA->RNA->Protein

Structure -> Function
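The DNA->RNA->Protein principle listed above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the input is the coding (sense) strand, the standard genetic code applies, and translation starts at the first base and stops at the first stop codon. The function names `transcribe` and `translate` are chosen here for illustration only.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G
# (third position varies fastest); "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Coding-strand DNA to mRNA: replace T with U (illustrative helper)."""
    return dna.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA from its first base until the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGAAATTTTAG"
mrna = transcribe(dna)
print(mrna)             # prints AUGAAAUUUUAG
print(translate(mrna))  # prints MKF
```

The sketch does not handle ambiguous bases such as N or organism-specific genetic codes.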