Introduction to Bioinformatics - Shandong University › Download2 ›...


Page 1: Introduction to Bioinformatics - Shandong University › Download2 › 20130613155219409.pdfBioinformatics For Dummies, 2nd Edition, Jean-Michel Claverie, Cedric Notredame, 2007, Wiley.


Introduction to Bioinformatics

Dr. rer. nat. Gong Jing

Cancer Research Center

School of Medicine, Shandong University

2012.10.31

Introduction to Bioinformatics

Page 2: Chapter 1

Introduction

Page 3: About me

• Dr. rer. nat. Gong Jing
• Bachelor's degree in Marine Biology from the Ocean University of China (formerly Qingdao Ocean University)
• Bachelor's, Master's, and doctoral degrees in Bioinformatics from the Ludwig-Maximilians-Universität München, Germany
• Affiliation: Cancer Research Center of SDU
• Tel: 0531-88380202
• Email: [email protected]
• Office: Dianjing Building, Rm. 106, Baotuquan Campus

Page 4: About this course

• Schedule: 2012/10/31 – 2012/11/14, Wed. & Fri. 13:30 - 17:30
• Location: 8#, third floor, east, Computer Pool
• Homepage: http://www.crc.sdu.edu.cn/bioinfo/2012
• Table of Contents
  Chapter 1: Introduction
  Chapter 2: Databases
  Chapter 3: Alignment
  Chapter 4: Tree
  Chapter 5: Structure

My name is Lampy.

Page 5: Literature

1. Bioinformatics - An Introduction, 2nd Edition, Jeremy Ramsden, 2009, Springer.
2. Bioinformatics For Dummies, 2nd Edition, Jean-Michel Claverie, Cedric Notredame, 2007, Wiley.

Page 6: Literature

1. 生物信息学 (Bioinformatics), edited by Chen Ming, 2012-1-1, Science Press.
2. 生物信息学 (Bioinformatics), edited by Li Xia, 2010-8-1, People's Medical Publishing House.
3. 生物信息学 (Bioinformatics), edited by Xu Zhongneng, 2008-9-1, Tsinghua University Press.

Page 7: Information Page and Vocabulary List

Information Page (Chapter 1, 2012/10/31)

Dr. rer. nat. Jing Gong
Affiliation: Cancer Research Center of SDU
Tel: 0531-88380202
Email: [email protected]
Office: Dianjing Building, Rm. 106, Baotuquan Campus

Schedule: 2012/10/31 - 2012/11/14, Wed. & Fri. 13:30 - 17:30

Place: 8#, third floor, east, Computer Pool

Course Homepage: http://www.crc.sdu.edu.cn/bioinfo/2012

PubMed: http://www.ncbi.nlm.nih.gov/entrez/

ExPASy: http://expasy.org/

NCBI: http://www.ncbi.nlm.nih.gov/

PIR: http://pir.georgetown.edu

Vocabulary (Chapter 1, 2012/10/31)

FASTA

FASTA (pronounced FAST-Aye) stands for FAST-ALL, reflecting the fact that it can be used for a fast protein comparison or a fast nucleotide comparison. This program ……

BLAST

Basic Local Alignment Search Tool. A sequence comparison algorithm, optimized for speed, used to search sequence databases for the best local alignments. It uses a substitution matrix and the query sequence ……

Alignment (比对)

The result of a comparison of two or more gene or protein sequences in order to determine their degree of base or amino acid similarity. Sequence alignment is used to determine whether two or more ……

Page 8: What is Bioinformatics?

biochemistry, biometrics, biophysics, biohazards, biomathematics, bioterrorism, biopotato, bioinformatics

Page 9: What is Bioinformatics?

Interdisciplinary

• a biology or medical researcher, just like you
• a professional in the pharmaceutical industry
• a policeman worrying about DNA testing
• a computer scientist developing bio-databases
• a consumer concerned about GMOs (Genetically Modified Organisms)
• ……

Page 10: What is Bioinformatics?

DEFINITION:

Bioinformatics – the science of collecting and analyzing complex biological data such as genetic codes. [Oxford Dictionary]

Bioinformatics – the computational branch of molecular biology. [Bioinformatics for Dummies]

Bioinformatics – the application of computer science and information technology to the field of biology and medicine. [Wikipedia]

Bioinformatics – the science of how information is generated, transmitted, received, and interpreted in biological systems, i.e. the application of information science to biology. [Bioinformatics - An Introduction]

A formal definition?

Page 11: History of Bioinformatics

In 1809, French biologist Jean Baptiste Lamarck published “Philosophie Zoologique”. Lamarck stressed two main themes in his biological work:

1. The environment gives rise to changes in animals, i.e. changes through use and disuse.
2. Life is structured in an orderly manner, and the many different parts of all bodies make the organic movements of animals possible.

“blind as a mole” “show your teeth” “birds have no teeth?”

Jean Baptiste Lamarck (1744-1829)

Page 12: History of Bioinformatics

In 1859, English naturalist Charles Darwin published “On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life”.

Charles Darwin (1809-1882)

Page 13: History of Bioinformatics

In 1866, Austrian scientist Gregor Mendel demonstrated that the inheritance of certain traits in pea plants follows particular patterns, now referred to as the laws of “Mendelian Inheritance”.

Gregor J. Mendel (1822-1884)

Page 14: History of Bioinformatics

In 1869, Swiss physician and biologist Friedrich Miescher isolated DNA from white blood cells at Felix Hoppe-Seyler's laboratory at the University of Tübingen, Germany.

Nuclei → Nuclein → Nucleic acid → DNA

Friedrich Miescher (1844-1895)

Page 15: History of Bioinformatics

Thomas Hunt Morgan, American geneticist, is famous for his experimental research with the fruit fly, by which he established the chromosome theory of heredity. He showed that genes are linked in a series on chromosomes and are responsible for identifiable, hereditary traits. Morgan’s work played a key role in establishing the field of genetics. He received the Nobel Prize in Physiology or Medicine in 1933.

Thomas H. Morgan (1866-1945), Nobel prize 1933

Page 16: History of Bioinformatics

In 1944, American physician and medical researcher Oswald Avery and his co-workers Colin MacLeod and Maclyn McCarty demonstrated that DNA is the material of which genes and chromosomes are made.

In the experiment, the lipids, ribonucleic acids, carbohydrates, and proteins were destroyed first, and transformation still occurred. When the deoxyribonucleic acid was destroyed next, transformation no longer occurred.

Oswald Avery (1877-1955), Colin MacLeod (1909-1972), Maclyn McCarty (1911-2005)

Page 17: History of Bioinformatics

In 1950, American biochemist Erwin Chargaff noticed a pattern in the amounts of the four bases: adenine (A), thymine (T), cytosine (C), and guanine (G). He discovered that the amounts of adenine (A) and thymine (T) in DNA were roughly the same, as were the amounts of cytosine (C) and guanine (G). This later became known as Chargaff's rule.

Erwin Chargaff (1905-2002)

Page 18: History of Bioinformatics

In 1953, James D. Watson and Francis Crick proposed the first correct double-helix model of DNA structure in the journal Nature. Their double-helix model of DNA was based on a single X-ray diffraction image taken by Rosalind Franklin and Maurice Wilkins in 1952.

Rosalind Franklin (1920-1958)
James Watson (1928- ), Nobel prize 1962
Francis Crick (1916-2004), Nobel prize 1962
Maurice Wilkins (1916-2004), Nobel prize 1962

Page 19: History of Bioinformatics

The sequence of the 77 nucleotides of a yeast alanine tRNA was determined by the American biochemist Robert W. Holley in 1965. Holley was awarded the 1968 Nobel Prize in Physiology or Medicine for describing the structure of this tRNA, linking DNA and protein synthesis.

Robert W. Holley (1922-1993), Nobel prize 1968

Page 20: History of Bioinformatics

In 1977, Frederick Sanger and colleagues introduced the “dideoxy” chain-termination method for sequencing DNA molecules, also known as the “Sanger method”. For this work he shared the 1980 Nobel Prize in Chemistry with Walter Gilbert.

The key principle of the Sanger method is the use of dideoxynucleotide triphosphates (ddNTPs) as DNA chain terminators.

Frederick Sanger (1918- ), Nobel prize 1980
Walter Gilbert (1932- ), Nobel prize 1980

Page 21: History of Bioinformatics

Read the protein sequence directly in the DNA sequence!

The central dogma of molecular biology was first stated by Francis Crick in 1958 and re-stated in a Nature paper published in 1970.

Francis Crick (1916-2004)

Page 22: History of Bioinformatics

Marshall Warren Nirenberg shared the Nobel Prize in Physiology or Medicine in 1968 with Har Gobind Khorana and Robert W. Holley for "breaking the genetic code" and describing how it operates in protein synthesis.

Marshall Warren Nirenberg (1927-2010), Nobel prize 1968
Har Gobind Khorana (1922-2011), Nobel prize 1968
Robert W. Holley (1922-1993), Nobel prize 1968

Page 23: History of Bioinformatics

Protein is a nutrient needed by the human body for growth and maintenance.

Aside from water, protein is the most abundant molecule in the body.

All proteins are made up of the same basic building blocks, called amino acids.

Amino acids are made of carbon, hydrogen, oxygen, nitrogen, and sulfur atoms.

A protein = C1200H2400O600N300S100

Page 24: History of Bioinformatics

#   Name           3-letter  1-letter
1   Alanine        Ala       A
2   Arginine       Arg       R
3   Asparagine     Asn       N
4   Aspartic acid  Asp       D
5   Cysteine       Cys       C
6   Glutamine      Gln       Q
7   Glutamic acid  Glu       E
8   Glycine        Gly       G
9   Histidine      His       H
10  Isoleucine     Ile       I
11  Leucine        Leu       L
12  Lysine         Lys       K
13  Methionine     Met       M
14  Phenylalanine  Phe       F
15  Proline        Pro       P
16  Serine         Ser       S
17  Threonine      Thr       T
18  Tryptophan     Trp       W
19  Tyrosine       Tyr       Y
20  Valine         Val       V

A given type of protein always contains the same total number of amino acids in the same proportion.

Amino acids are linked together as a chain. The first amino acid sequence of a protein, insulin, was determined in 1951 by Dr. Sanger.

Frederick Sanger (1918- ), Nobel prize 1958

insulin = MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN

insulin = (30 glycines + 44 alanines + 5 tyrosines + 14 glutamines + . . .)
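The idea of describing a protein by its amino acid composition can be tried out on the sequence printed on the slide. The following is a minimal sketch, assuming the sequence is typed in exactly as shown (with the line-break spaces removed); the function name `composition` is hypothetical. The counts written on the slide look illustrative, so the numbers computed here may differ from them.

```python
from collections import Counter

# Sequence exactly as printed on the slide, joined into one string.
INSULIN = (
    "MALWMRLLPLLALLALWGPDPAAAFVNQHL"
    "CGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGG"
    "GPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN"
)


def composition(sequence):
    """Count how often each one-letter amino acid code occurs."""
    return Counter(sequence)


counts = composition(INSULIN)

# Show the five most frequent residues with their counts.
for residue, n in counts.most_common(5):
    print(residue, n)
```

A composition discards the order of the residues, which is why the slide contrasts it with the full sequence.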

Page 25: History of Bioinformatics

Protein Sequence: MAVLD

The first 3D structure of a protein was determined in 1958 by Drs. Kendrew and Perutz, using the complicated technique of X-ray crystallography.

Max Ferdinand Perutz (1914-2002), Nobel prize 1962
John Cowdery Kendrew (1917-1997), Nobel prize 1962


Page 27: History of Bioinformatics

In 1956, the Symposium on Information Theory in Biology was held (Gatlinburg, USA).

In 1979, GenBank was established at Los Alamos National Laboratory (USA).

In 1982, the nucleotide sequence database of the European Molecular Biology Laboratory (EMBL) was created (Europe).

In 1986, the DNA Data Bank of Japan (DDBJ) began data bank activities at NIG (Japan).

In the early 1990s, the International Nucleotide Sequence Database Collaboration (INSDC) was founded as a cooperation of GenBank, EMBL, and DDBJ.

In 1987, the Chinese-American scientist LIN Hua-an coined the word “bioinformatics”. At the very beginning he used the word “compbio”, then “bioinformatique”, and then “bio-informatics”. At that time, however, email titles did not support the hyphen symbol, and thus “bioinformatics” was born.

Since at least the late 1980s, the term “bioinformatics” has been used primarily in genomics and genetics, particularly in those areas of genomics involving large-scale DNA sequencing.


Page 29: History of Bioinformatics

Human Genome Project

         Publicly funded project            Privately funded project
Led by   James D. Watson & Francis Collins  Craig Venter
Began    1990                               1998
Budget   $3 billion                         $300 million
Data     freely available                   patented
2000     90%                                90%
2001     99%                                99%
2003     finished                           finished

President Clinton (2000)


Page 31: History of Bioinformatics

AB SOLiD™ 4.0 System × 27
Illumina HiSeq 2000 × 137

Beijing, Shanghai, Shenzhen

Page 32: What Bioinformatics Can Do for You?

• Analyzing DNAs
• Analyzing RNAs
• Analyzing Proteins
• Others: Pathway, Bioimaging, Statistics, etc.

Page 33: What Bioinformatics Can Do for You?

Analyzing DNAs

1. Read the DNA sequence: ATGGAAGTATTTAAAGCGCCACCTATTGGGATATAAG

2. Decompose it into successive triplets: ATG GAA GTA TTT AAA GCG CCA CCT ATT GGG ATA TAA G . . .

3. Translate each triplet into the corresponding amino acid: M E V F K A P P I G I STOP
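The three steps above can be sketched in a few lines of Python. This is a minimal sketch, assuming the standard genetic code and reading frame 1 (starting at the first base); the names `split_codons` and `translate` are hypothetical, and a trailing incomplete triplet such as the final `G` is simply ignored.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(dna):
    """Step 2: cut the sequence into successive complete triplets."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def translate(dna):
    """Step 3: translate triplets into amino acids until a stop codon."""
    protein = []
    for codon in split_codons(dna.upper()):
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Step 1: the DNA sequence from the slide.
dna = "ATGGAAGTATTTAAAGCGCCACCTATTGGGATATAAG"
print(" ".join(split_codons(dna)))
print(translate(dna))  # MEVFKAPPIGI
```

A real gene may sit in any of six reading frames (three per strand), so this sketch covers only the simplest case shown on the slide.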

Page 34: What Bioinformatics Can Do for You?

Analyzing DNAs

DNA: ATGGAAGTATTTAA……

Protein: MEVFKAP…

Database

Page 35: What Bioinformatics Can Do for You?

Analyzing RNAs

In the context of bioinformatics, there are only two important differences between RNA and DNA:

• RNA differs from DNA by one nucleotide: uracil (U) takes the place of thymine (T).
• RNA comes as a single strand.
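For sequence work, the one-nucleotide difference means a DNA sequence can be rewritten as RNA by swapping a single letter. A minimal sketch, assuming the input is the coding strand written with the letters A, C, G, and T; the function name `transcribe` is hypothetical.

```python
def transcribe(dna):
    """Rewrite a DNA coding strand as RNA: every T becomes U."""
    return dna.upper().replace("T", "U")


print(transcribe("ATGGAAGTATTTAA"))  # AUGGAAGUAUUUAA
```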

Page 36: What Bioinformatics Can Do for You?

Analyzing RNAs

Even though RNA molecules consist of single strands of nucleotides, their natural urge for pairing with complementary sequences is still there.

Hairpin shapes are the basic elements of RNA secondary structure; they’re made up of loops (the unpaired C-U) and stems (the paired regions).

All transfer RNAs (tRNAs) assemble themselves into a shape like a cloverleaf.
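Whether two stretches of one RNA strand can pair into a stem can be checked by comparing one stretch with the reverse complement of the other. A minimal sketch, assuming only the A-U and G-C pairs are allowed (G-U wobble pairs are ignored); the example sequences and the names `reverse_complement` and `can_form_stem` are hypothetical.

```python
# Watson-Crick partners in RNA.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the sequence that pairs with rna, read 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(rna))


def can_form_stem(left, right):
    """True if right pairs perfectly with left when the strand folds back."""
    return right == reverse_complement(left)


# Hypothetical hairpin: stem GGAC, loop UUCG, stem GUCC.
print(can_form_stem("GGAC", "GUCC"))  # True
print(can_form_stem("GGAC", "GGAC"))  # False
```

Real secondary-structure prediction tools score many competing foldings; this check only illustrates the pairing rule behind a single stem.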

Page 37: What Bioinformatics Can Do for You?

Analyzing Proteins

Protein Structure Determination:

• Experimental Methods: X-ray Crystallography, Nuclear Magnetic Resonance (NMR)
• Computational Methods: De novo method, Homology Modeling, Threading, and ensemble method

The first 3D structure of a protein was determined in 1958 using X-ray crystallography.

Page 38: What Bioinformatics Can Do for You?

Analyzing Proteins

Sequence, Structure, Function

Maestro, VMD, PyMOL

Page 39: What Bioinformatics Can Do for You?

Analyzing Proteins

Drug Design:

• Virtual Screening
• Docking

Virtual screening involves the rapid in silico assessment of large libraries of chemical structures in order to identify those structures which are most likely to bind to a drug target, typically a protein receptor or enzyme.

Page 40: What Bioinformatics Can Do for You?

Analyzing Proteins

Molecular dynamics (MD) is a computer simulation of physical movements of atoms and molecules.

Super-computer

500-aa protein, 1 ns (10^-9 s), 120 cores -> 5 hours
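The figure on the slide (1 ns on 120 cores in 5 hours) corresponds to 600 core-hours per simulated nanosecond, which gives a rough way to estimate longer runs. A minimal sketch, assuming the cost scales linearly with simulated time and inversely with the number of cores, which real simulations only approximate; the function name `md_wall_hours` is hypothetical.

```python
# Taken from the slide: 120 cores * 5 hours for 1 ns of a 500-aa protein.
CORE_HOURS_PER_NS = 120 * 5.0


def md_wall_hours(nanoseconds, cores, core_hours_per_ns=CORE_HOURS_PER_NS):
    """Estimate wall-clock hours, assuming perfect linear scaling."""
    return nanoseconds * core_hours_per_ns / cores


print(md_wall_hours(1, 120))    # 5.0 hours, the case on the slide
print(md_wall_hours(100, 120))  # 500.0 hours for 100 ns on the same cores
```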

Page 41: What Bioinformatics Can Do for You?

Analyzing Protein Sequences

Bavaria Supercomputing Centre

• Linux Cluster: 2007, 753 nodes, 5646 cores, 43 TFLOP/s
• HLRB II: 2007, 9728 cores, 62 TFLOP/s
• SuperMUC: 2012, 140000 cores, 3 PFLOP/s
• Tianhe-1 (天河一号): 2.5 PFLOP/s, No. 1 in the world

1 TFLOP/s = 10^12 floating-point operations per second; 1 PFLOP/s = 10^15 floating-point operations per second.

Pictured: Linux Cluster, HLRB II, SuperMUC

Page 42: What Bioinformatics Can Do for You?

Others: Pathway, Bioimaging, Statistics, etc.

CT, magnetic resonance, statistical graph

Page 43: How Most People Use Bioinformatics?

• Becoming an Instant Expert with PubMed
• Retrieving Protein Sequences
• Retrieving DNA Sequences
• Using BLAST to Compare Your Protein Sequence
• Making a Multiple Protein Sequence Alignment with ClustalW
• Retrieving a 3D Protein Structure

Page 44: How Most People Use Bioinformatics?

Becoming an Instant Expert with PubMed

Gene Sequence

Specialist in Bioinformatics

Great! It’s dUTPase.

But what’s dUTPase?

Page 45: How Most People Use Bioinformatics?

Becoming an Instant Expert with PubMed

http://www.ncbi.nlm.nih.gov/pubmed

dUTPase


Page 51: How Most People Use Bioinformatics?

Becoming an Instant Expert with PubMed

Author Name

Page 52: How Most People Use Bioinformatics?

Becoming an Instant Expert with PubMed

Author Name + Topic


Page 56: Introduction to Bioinformatics - Shandong University › Download2 › 20130613155219409.pdfBioinformatics For Dummies, 2nd Edition, Jean-Michel Claverie, Cedric Notredame, 2007, Wiley.

56

Internal structure of a database record:

The information is spread out over separate sections, called fields:

• PubMed ID
• Publication date
• Title
• Page
• Abstract
• Laboratory address
• Authors


Search “Down” in field “Author [AU]”

Search “Down” in field “Title [TI]”

Search “Down” in field “Laboratory address [AD]”

Search “Down” everywhere
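The effect of restricting a search to one field can be sketched in a few lines of Python. This is only a toy model, not how PubMed works internally: the three records, the field names and the function `search` are all hypothetical and made up for illustration.

```python
# Toy records; every value below is invented for illustration only.
RECORDS = [
    {"pmid": "1", "title": "Notes on Down syndrome",
     "authors": ["Smith J"], "address": "London, UK"},
    {"pmid": "2", "title": "A study of dUTPase",
     "authors": ["Down T"], "address": "Paris, France"},
    {"pmid": "3", "title": "Protein folding",
     "authors": ["Lee K"], "address": "County Down, UK"},
]


def search(records, term, field=None):
    """Return the pmids of records containing term, optionally in one field only."""
    term = term.lower()
    hits = []
    for rec in records:
        # Search a single field if one is given, otherwise every field except the id.
        fields = [field] if field else [name for name in rec if name != "pmid"]
        for name in fields:
            value = rec[name]
            text = " ".join(value) if isinstance(value, list) else value
            if term in text.lower():
                hits.append(rec["pmid"])
                break
    return hits


print(search(RECORDS, "Down", "authors"))  # author field only
print(search(RECORDS, "Down", "title"))    # title field only
print(search(RECORDS, "Down"))             # everywhere
```

The same word selects different records depending on the field, which is why a field-restricted query is usually more precise than a search everywhere.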


Using fields to find experts near you:

Beijing

Tel: 86-10-6275-5002 Fax: 86-10-6276-2292 New Life Science Building, Peking University, Summer Palace Road No. 5, Beijing, P. R. China 100871


Searching PubMed using Advanced Search

http://www.ncbi.nlm.nih.gov/pubmed


A few more tips about PubMed

How to get the most out of your query:

• Quoted queries (for example, “down syndrome”)

• Logical connectors: AND, OR, NOT (for example, dUTPase [TI] AND bacteria [TI] NOT Smith [AU])

• Initials added to proper names (for example, “Abergel C”)

• PubMed Identifier (the number in the PMID field)

What you will not find in PubMed:

• Names ranking beyond the 10th place in the author list for older papers (before 1995).

• Papers recorded before 1965.

• Abstracts for most references recorded before 1976.
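The query tips above can be sketched as a small string builder. This is a minimal illustration only: the helper names `tag` and `build_query` are hypothetical, and the result is simply text that you would paste into the PubMed search box.

```python
def tag(term, field=None):
    """Quote multi-word terms and append an optional field tag such as TI or AU."""
    text = f'"{term}"' if " " in term else term
    return f"{text}[{field}]" if field else text


def build_query(all_of=(), none_of=()):
    """Join the required terms with AND, then exclude terms with NOT."""
    query = " AND ".join(all_of)
    for term in none_of:
        query += f" NOT {term}"
    return query


query = build_query(
    all_of=[tag("dUTPase", "TI"), tag("bacteria", "TI")],
    none_of=[tag("Smith", "AU")],
)
print(query)                 # the example query from the slide
print(tag("down syndrome"))  # a quoted phrase
```

Building the query from parts makes it easy to add or drop one condition at a time when a search returns too many or too few papers.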


Retrieving Protein Sequences http://expasy.org/

• Acquire some preliminary information about a particular function that you’re interested in: dUTPase.

• Find out more about it by retrieving a few examples of protein sequences that perform this function in E. coli (ExPASy).

Prof. Amos Bairoch

dUTPase coli


Tab

Excel

FASTA


“Cross-references” point to data collections other than UniProtKB.


“Sequences” provides you with the actual amino acid sequence of the protein.

Save this sequence on your Desktop as “P06968.fasta” (right click).


What is FASTA? (Does it have anything to do with PASTA?)

FASTA is the name of a popular sequence alignment and database scanning program created by W.R. Pearson and D.J. Lipman in 1988. Its legacy is the FASTA format, which is now ubiquitous in bioinformatics.

The sequence in FASTA format:

>P06968 My_Sequence_Name
ARCGTCRGCKINTANDRGCKINTANDCKINTANDARCGTCRGCKINTANDRGCKINTAND

The line starting with > (the definition line) contains a unique identifier followed by an optional short definition. The lines that follow it contain the DNA or protein sequence (in one-letter code) until the next > symbol indicates the beginning of a new sequence.
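The format description above can be turned into a short reader. This is a minimal sketch, not a full-featured parser: the function names `parse_fasta` and `_finish` are hypothetical, and the example text is a shortened version of the slide's made-up sequence, split over two lines to show that sequence lines are joined.

```python
def parse_fasta(text):
    """Return a list of (identifier, description, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new definition line closes the previous record, if any.
            if header is not None:
                records.append(_finish(header, chunks))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append(_finish(header, chunks))
    return records


def _finish(header, chunks):
    """Split the definition line at the first space and join the sequence lines."""
    identifier, _, description = header.partition(" ")
    return identifier, description, "".join(chunks)


example = """>P06968 My_Sequence_Name
ARCGTCRGCKINTAND
RGCKINTAND
"""
for identifier, description, sequence in parse_fasta(example):
    print(identifier, description, len(sequence))
```

For real work an established library reader is usually the safer choice; the sketch only shows how little structure the format has.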


Retrieving DNA Sequences

• Acquire some preliminary information about a particular function that you’re interested in: dUTPase.

• Find out more about it by retrieving a few examples of protein sequences that perform this function in E. coli (ExPASy).

• Retrieve the DNA sequence relevant to the dUTPase protein of E. coli.


Retrieving DNA Sequences http://expasy.org/

P06968


From UniProtKB (P06968), jump to ……

1. Summary Section

2. Reference Section


3. Features Section

• promoter elements
• ribosome binding sites (RBS)
• protein coding segments (CDS)

……

4. Sequence Section

Range of dUTPase ORF (CDS)

ORF translation

Retrieving DNA Sequences http://www.ncbi.nlm.nih.gov


Using BLAST to Compare Sequences

• Acquire some preliminary information about a particular function that you’re interested in: dUTPase.

• Find out more about it by retrieving a few examples of protein sequences that perform this function in E. coli (ExPASy).

• Retrieve the DNA sequence relevant to the dUTPase protein of E. coli.

• Perform a BLAST search.


What is BLAST?

BLAST (Basic Local Alignment Search Tool) – A sequence comparison algorithm, optimized for speed, used to search sequence databases for optimal local alignments to a query.

BLASTn – BLASTn will search a DNA sequence against a DNA database.

BLASTp – BLASTp will compare a protein sequence against a protein database.

BLASTx – BLASTx will translate a nucleic acid sequence in all six reading frames and compare all these against the protein database of your choice.

BLAST? – BLAST? ……
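What "all six reading frames" means for BLASTx can be sketched as follows. This is only an illustration of the idea, not BLAST's own code: the function names are hypothetical, the standard genetic code is assumed, and the input is a made-up six-base sequence.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become X, stops become *."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Translate three offsets on the given strand and three on the reverse strand."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for name, protein in six_frames("ATGGCC").items():
    print(name, protein)
```

Each of the six translations is then compared against the protein database, which is why BLASTx can find a coding region without knowing its strand or frame in advance.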


Using BLAST to Compare Sequences http://www.ncbi.nlm.nih.gov/


Open “P06968.fasta” on your Desktop, and paste the sequence here.

Give a name here.

http://www.crc.sdu.edu.cn/bioinfo/2012/P06968.fasta


E-value: a value close to 1 or above is a warning that the conclusion you might draw from the alignments is NOT reliable. (E-values start at 0 and have no fixed upper limit.)
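Why a low E-value is reassuring can be seen from the usual relation between a bit score S' and the expected number of chance hits, E = m × n × 2^(−S'), where m is the query length and n the database length. The sketch below assumes this relation with plain (not effective) lengths; the function name and all numbers are hypothetical.

```python
def expect_value(bit_score, query_length, database_length):
    """Expected number of chance hits with at least this bit score."""
    return query_length * database_length * 2.0 ** (-bit_score)


# Made-up numbers: a 100-residue query against a 1000-residue database.
weak = expect_value(10, 100, 1000)
strong = expect_value(50, 100, 1000)
print(weak)    # far above 1: hits this good are expected by chance
print(strong)  # far below 1: unlikely to be a chance match
```

The example also shows that an E-value depends on the size of the database searched, so the same alignment can look more or less convincing in a different database.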


to see the alignment between your query sequence and the matching sequence of the protein that corresponds to this score.

to see the corresponding database entry.


What is Alignment?

Alignment is the result of a comparison of two or more gene or protein sequences in order to determine their degree of base or amino acid similarity.

Pairwise Alignment

Multiple Alignment
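The "degree of similarity" of a pairwise alignment is often summarised as percent identity. The sketch below is one possible definition, assumed here for illustration: identical residues divided by the number of columns in which neither sequence has a gap. Other definitions use a different denominator, and the function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap
    ]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Two made-up aligned sequences; "-" marks a gap.
print(percent_identity("ACD-EF", "ACDGEY"))
```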


Making a Multiple Sequence Alignment

• Acquire some preliminary information about a particular function that you’re interested in: dUTPase.

• Find out more about it by retrieving a few examples of protein sequences that perform this function in E. coli (ExPASy).

• Retrieve the DNA sequence relevant to the dUTPase protein of E. coli.

• Perform a BLAST search.

• Perform a multiple alignment.


Multiple alignments are used to:

• Identify sequence positions where specific amino acids really matter for the structural integrity or the function of a given protein

• Define specific sequence signatures for protein families

• Classify sequences and build evolutionary trees


Making a Multiple Sequence Alignment http://pir.georgetown.edu


Get sequences from: http://www.crc.sdu.edu.cn/bioinfo/2012/multi.fasta

http://1.51.212.243/multi.fasta

Select all

Copy


Paste


* identical

: similar

. related

(blank) different
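The "*" row of such an alignment can be reproduced with a few lines of code. The sketch below marks only fully identical columns; the ":" and "." symbols depend on amino acid similarity groups defined by the alignment program and are deliberately left out. The function name and the three short sequences are hypothetical.

```python
def identity_line(aligned, gap="-"):
    """Return '*' under columns where every sequence has the same residue."""
    lengths = {len(seq) for seq in aligned}
    if len(lengths) != 1:
        raise ValueError("need at least one sequence, all of the same length")
    marks = []
    for column in zip(*aligned):
        residues = set(column)
        identical = len(residues) == 1 and gap not in residues
        marks.append("*" if identical else " ")
    return "".join(marks)


# Three made-up aligned sequences.
alignment = ["ACDE", "ACDF", "ACEE"]
for seq in alignment:
    print(seq)
print(identity_line(alignment))
```

A run of consecutive "*" symbols is a simple first indication of a conserved region.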


Conserved region


Retrieve a protein structure

• Acquire some preliminary information about a particular function that you’re interested in: dUTPase.

• Find out more about it by retrieving a few examples of protein sequences that perform this function in E. coli (ExPASy).

• Retrieve the DNA sequence relevant to the dUTPase protein of E. coli.

• Perform a BLAST search.

• Perform a multiple alignment.

• Retrieve a protein structure.


dUTPase:

• protein sequence
• DNA sequence
• 3D structure


Using fields to find experts near you:

Beijing

Tel: 86-10-6275-5002 Fax: 86-10-6276-2292 New Life Science Building, Peking University, Summer Palace Road No. 5, Beijing, P. R. China 100871


Su XD dUTPase

Retrieve a protein structure http://www.pdb.org


Press left button

Pressing left button

Jmol mouse controls (action: result)

Right-click: Jmol menu

Left-click: select/deselect residue

Shift + left-click and drag the mouse up or down, or roll the middle mouse button: zoom

Left-click and drag: rotate view

Backbone by chain
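The "backbone by chain" display groups the backbone atoms of each polypeptide chain. A rough sketch of the same grouping, applied directly to the text of a PDB-format file, might look like the following. The function name `backbone_by_chain` is hypothetical, and the code relies only on the fixed column layout of PDB `ATOM` records; it is not how Jmol itself is implemented.

```python
from collections import defaultdict

# Atom names that make up the protein backbone.
BACKBONE_ATOMS = {"N", "CA", "C", "O"}


def backbone_by_chain(pdb_lines):
    """Group backbone atoms by chain ID using PDB-format ATOM records.

    Returns a dict mapping chain ID to a list of (residue number, atom name).
    """
    chains = defaultdict(list)
    for line in pdb_lines:
        # Skip everything that is not a full-length ATOM record.
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        atom_name = line[12:16].strip()      # columns 13-16: atom name
        if atom_name not in BACKBONE_ATOMS:
            continue                         # side-chain atom, ignore
        chain_id = line[21]                  # column 22: chain identifier
        residue_number = int(line[22:26])    # columns 23-26: residue number
        chains[chain_id].append((residue_number, atom_name))
    return dict(chains)


if __name__ == "__main__":
    import sys

    # Usage: python backbone.py structure.pdb
    with open(sys.argv[1]) as handle:
        for chain, atoms in backbone_by_chain(handle).items():
            print(chain, len(atoms), "backbone atoms")
```

Counting backbone atoms per chain is a quick way to confirm how many chains a downloaded structure contains before viewing it.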
