Post on 26-Jun-2020
BioinformaticsSome selected examples . . . and a bit of an overview
Ingo Ruczinski
Department of BiostatisticsJohns Hopkins Bloomberg School of Public Health
July 19, 2007 @ EnviroHealth Connections
Ingo Ruczinski Bioinformatics / Computational Biology
Bioinformatics and Computational Biology
Wikipedia:
Bioinformatics and computational biology involve the use oftechniques including applied mathematics, informatics, statistics,computer science, artificial intelligence, chemistry, and biochemistryto solve biological problems usually on the molecular level.· · ·
Major research efforts in the field include sequence alignment, genefinding, genome assembly, protein structure alignment, proteinstructure prediction, prediction of gene expression and protein-proteininteractions, and the modeling of evolution.
Ingo Ruczinski Bioinformatics / Computational Biology
Bioinformatics and Computational Biology
Wikipedia:
The terms bioinformatics and computational biology are often usedinterchangeably. However bioinformatics more properly refers to thecreation and advancement of algorithms, computational and statisticaltechniques, and theory to solve formal and practical problemsinspired from the management and analysis of biological data.Computational biology, on the other hand, refers to hypothesis-driveninvestigation of a specific biological problem using computers, carriedout with experimental or simulated data, with the primary goal ofdiscovery and the advancement of biological knowledge.
Ingo Ruczinski Bioinformatics / Computational Biology
Bioinformatics and Computational Biology
NIH definition of Bioinformatics and Computational Biology:
Bioinformatics and computational biology are rooted in life sciencesas well as computer and information sciences and technologies. Bothof these interdisciplinary approaches draw from specific disciplinessuch as mathematics, physics, computer science and engineering,biology, and behavioral science.· · ·
Bioinformatics applies principles of information sciences andtechnologies to make the vast, diverse, and complex life sciencesdata more understandable and useful. Computational biology usesmathematical and computational approaches to address theoreticaland experimental questions in biology. Although bioinformatics andcomputational biology are distinct, there is also significant overlapand activity at their interface.
Ingo Ruczinski Bioinformatics / Computational Biology
Bioinformatics and Computational Biology
NIH definition of Bioinformatics and Computational Biology:
The NIH Biomedical Information Science and Technology InitiativeConsortium agreed on the following definitions of bioinformatics andcomputational biology recognizing that no definition could completelyeliminate overlap with other activities or preclude variations ininterpretation by different individuals and organizations.
Bioinformatics: Research, development, or application ofcomputational tools and approaches for expanding the use ofbiological, medical, behavioral or health data, including those toacquire, store, organize, archive, analyze, or visualize such data.
Computational Biology: The development and application ofdata-analytical and theoretical methods, mathematical modeling andcomputational simulation techniques to the study of biological,behavioral, and social systems.
Ingo Ruczinski Bioinformatics / Computational Biology
The central dogma of biology
Drawn by Ebbe Sloth Andersen http://old.mb.au.dk/
Ingo Ruczinski Bioinformatics / Computational Biology
Topics
DNA −→ RNA −→ Protein
DNA Sequence analysis, genome annotation, evolutionarybiology, phylogeny, DNA alterations, comparativegenomics, SNP association studies
RNA Analysis of gene expression and regulation
Proteins Analysis of protein expression, protein-protein docking,prediction of protein structure
Systems Biology:modeling biological systems, gene/protein networks
Ingo Ruczinski Bioinformatics / Computational Biology
Some selected examples
1 Chromosomal alterations
2 Protein structure prediction
3 2D gel electrophoresis
Ingo Ruczinski Bioinformatics / Computational Biology
Some selected examples
1 Chromosomal alterations
2 Protein structure prediction
3 2D gel electrophoresis
Ingo Ruczinski Bioinformatics / Computational Biology
Karyotypes
Ingo Ruczinski Bioinformatics / Computational Biology
Trisomy
http://www.medgen.ubc.ca
Ingo Ruczinski Bioinformatics / Computational Biology
DNA changes
Ingo Ruczinski Bioinformatics / Computational Biology
The data
Ingo Ruczinski Bioinformatics / Computational Biology
Deletion
Ingo Ruczinski Bioinformatics / Computational Biology
FISH
Ingo Ruczinski Bioinformatics / Computational Biology
Amplification
Ingo Ruczinski Bioinformatics / Computational Biology
Uniparental Isodisomy
Ingo Ruczinski Bioinformatics / Computational Biology
Cancer samples
Ingo Ruczinski Bioinformatics / Computational Biology
Mosaicism
Ingo Ruczinski Bioinformatics / Computational Biology
SNPchip S4 classes and methods
Ingo Ruczinski Bioinformatics / Computational Biology
Estimation
1 By SNP:
Estimate genotype and copy number for each SNP.
2 Within a sample:
Borrow strength between SNPs to infer regions of LOHand copy number changes.
3 Between samples:
Comparison between normal and disease populations tofind chromosomal alterations associated with disease.
Ingo Ruczinski Bioinformatics / Computational Biology
Vanilla ICE
1
2
3
5
1
2
3
4
5A B CD E
ICE
Van
DeletionNormal
LOHAmplification
50 51 52 53 54 55
1
2
3
A
69 70 71 72 73 74
D
174 176 178
B
238 240 242 244 246
E
Mb
ICE
Van
Ingo Ruczinski Bioinformatics / Computational Biology
A HapMap sample
100 150 200
1
2
345
ICE
Van
DeletionNormal
LOHAmplification
190
135 140 145 150
1
2
345
143
Mb
149 149.5 150 150.5 151 151.5 152
ICE
Van
Ingo Ruczinski Bioinformatics / Computational Biology
Many HapMap samples
Ingo Ruczinski Bioinformatics / Computational Biology
SNP Trio
Ingo Ruczinski Bioinformatics / Computational Biology
HMM for SNP Trio
chromosome 10
position (Mb)
0 20 40 60 80 100 120 140
MI−S
MI−D
UPI−M
UPI−F
BPI
BPI
non−BPI
chromosome 22
position (Mb)
15 20 25 30 35 40 45 50
MI−S
MI−D
UPI−M
UPI−F
BPI
BPI
non−BPI
Ingo Ruczinski Bioinformatics / Computational Biology
HMM for SNP Trio
80.9
1
MI−S
MI−D
UPI−M
UPI−P
BPI
HMM
child
copy
num
ber
−2−1.5
−1−0.5
00.5
1
mother
copy
num
ber
−2−1.5
−1−0.5
00.5
1
father
copy
num
ber
−2−1.5
−1−0.5
00.5
1
0 20 40 60 80 100 120 135
80.9
1
MI−S
MI−D
UPI−M
UPI−P
BPI
HMM
child
copy
num
ber
−2−1.5
−1−0.5
00.5
1
mother
copy
num
ber
−2−1.5
−1−0.5
00.5
1
father
copy
num
ber
−2−1.5
−1−0.5
00.5
1
15 20 25 30 35 40 45 49
Ingo Ruczinski Bioinformatics / Computational Biology
Some selected examples
1 Chromosomal alterations
2 Protein structure prediction
3 2D gel electrophoresis
Ingo Ruczinski Bioinformatics / Computational Biology
Proteins
−→ Amino acids are the building blocks of proteins.
Ingo Ruczinski Bioinformatics / Computational Biology
Proteins
−→ Both figures show the same protein (the bacterial protein L).
The right figure also highlights the secondary structure elements.
Ingo Ruczinski Bioinformatics / Computational Biology
Proteins
From Lehninger, Principles of Biochemistry
Ingo Ruczinski Bioinformatics / Computational Biology
Functional Annotation
Ingo Ruczinski Bioinformatics / Computational Biology
Genome Wide Annotation
Ingo Ruczinski Bioinformatics / Computational Biology
Some selected examples
1 Chromosomal alterations
2 Protein structure prediction
3 2D gel electrophoresis
Ingo Ruczinski Bioinformatics / Computational Biology
2D Gel Electrophoresis
Ingo Ruczinski Bioinformatics / Computational Biology
2D Gel Electrophoresis
0 100 200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500
0 100 200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500
1300
1200
1100
1000
900
800
700
600
500
400
300
200
100
0
1300
1200
1100
1000
900
800
700
600
500
400
300
200
100
0
0 100 200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500
0 100 200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500
1300
1200
1100
1000
900
800
700
600
500
400
300
200
100
0
1300
1200
1100
1000
900
800
700
600
500
400
300
200
100
0
Ingo Ruczinski Bioinformatics / Computational Biology
2D Gel Electrophoresis
A:1 A:2 A:3 A:4 A:5 A:6 A:7 A:8 A:9 A:10 A:11 A:12 B:1 B:2 B:3 B:4 B:5 B:6 B:7 B:8 B:9 B:10 B:11 B:12
A B
Ingo Ruczinski Bioinformatics / Computational Biology
2D Gel Electrophoresis
A:1 A:2 A:3 A:4 B:1 B:2 B:3 B:4
A B
Ingo Ruczinski Bioinformatics / Computational Biology
2D Gel Electrophoresis
−20
0
20
40
60
% r
educ
tion
of c
once
ntra
tion
as c
ompa
red
to b
ackg
roun
d 1st Trimester
3rd Trimester
Folate Placebo
Ingo Ruczinski Bioinformatics / Computational Biology
2D Gel Electrophoresis
0 100 200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500
0 100 200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500
1300
1200
1100
1000
900
800
700
600
500
400
300
200
100
0
1300
1200
1100
1000
900
800
700
600
500
400
300
200
100
0
Ingo Ruczinski Bioinformatics / Computational Biology
2D Gel Electrophoresis
0 100 200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500
0 100 200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500
1300
1200
1100
1000
900
800
700
600
500
400
300
200
100
0
1300
1200
1100
1000
900
800
700
600
500
400
300
200
100
0
790
1056
853
662
1458
948
1026
770
586
768
797
332
248
Ingo Ruczinski Bioinformatics / Computational Biology
http://biostat.jhsph.edu/∼iruczins/
Ingo Ruczinski Bioinformatics / Computational Biology