REMINDERS
![Page 1: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/1.jpg)
REMINDERS
2nd Exam on Nov. 17
Coverage:
• Central Dogma of DNA
  – Replication
  – Transcription
  – Translation
• Cell structure and function
• Recombinant DNA technology and molecular biology
• Protein analysis
![Page 2: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/2.jpg)
BIOINFORMATICS
![Page 3: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/3.jpg)
BIOINFORMATICS
• Study of the structure of biological information and biological systems
• Integrates theories and tools of mathematics/statistics, computer science and information technology
• Involves the use of hardware and software to study vast amounts of biological data
![Page 4: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/4.jpg)
What is Bioinformatics?
• The field of science in which biology, computer science, and information technology merge to form a single discipline
• The application of information technology to the storage, management and analysis of biological information
• Facilitated by the use of computers
![Page 5: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/5.jpg)
![Page 6: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/6.jpg)
FUNCTIONS
• Data Management
  – Storage
  – Retrieval
• Data Analysis
*Literature/Bibliography, Sequence, Structure, Taxonomy, Expression, etc.
![Page 7: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/7.jpg)
BIOLOGICAL DATABASES
• Systematic data storage/retrieval
• Maintained on a regular basis
• Can contain various types of data (integration)
  – Sequence
  – Structure
  – Other pertinent information
• Nucleotides and proteins are most common
![Page 8: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/8.jpg)
DATABASES
• A database is a large, organized body of persistent data, usually associated with computerized software designed to update, query, and retrieve components of the data stored within the system
• Biological databases usually consist of the nucleic acid sequences of the genetic material of various organisms, as well as protein sequences and structures
![Page 9: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/9.jpg)
DATABASES
• e.g. a nucleotide sequence database typically contains information such as
  – contact name
  – the input sequence with a description of the type of molecule
  – the scientific name of the source organism from which it was isolated
• additional requirements
  – easy access to the information
  – a method for extracting only that information needed to answer a specific biological question
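As a rough illustration of the fields and the "extract only what is needed" requirement listed above, a record can be sketched in Python. The class name `SequenceRecord`, its field names, the helper `find_by_organism`, and the two sample entries are all made up for this example; real GenBank/ENA/DDBJ entries carry many more fields.

```python
from dataclasses import dataclass


@dataclass
class SequenceRecord:
    """Hypothetical minimal record; not the schema of any real database."""
    contact_name: str
    sequence: str
    molecule_type: str
    organism: str


def find_by_organism(records, organism):
    """Return only the records whose source organism matches (case-insensitive)."""
    wanted = organism.lower()
    return [r for r in records if r.organism.lower() == wanted]


# Made-up example entries, for illustration only.
records = [
    SequenceRecord("A. Researcher", "ATGGCC", "DNA", "Saccharomyces cerevisiae"),
    SequenceRecord("B. Researcher", "AUGGCC", "mRNA", "Escherichia coli"),
]

hits = find_by_organism(records, "escherichia coli")
print([r.sequence for r in hits])
```

The query returns just the matching entries, which is the kind of targeted retrieval the slide asks of a database.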
![Page 10: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/10.jpg)
DATABASES
• Sequence
  – GenBank, European Nucleotide Archive (ENA) and DNA Data Bank of Japan (DDBJ); coordinated through the International Nucleotide Sequence Database Collaboration (INSDC)
  – UniGene
  – Saccharomyces Genome Database (SGD)
  – UniProtKB (UniProtKB/Swiss-Prot or UniProtKB/TrEMBL)
  – ExPASy
![Page 11: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/11.jpg)
DATABASES
• Structure
  – Nucleic Acid Database (NDB)
  – Protein Data Bank (PDB)
  – Worldwide Protein Data Bank (wwPDB)
  – ExPASy
![Page 12: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/12.jpg)
DATA MINING
• Process by which testable hypotheses are created regarding the function/structure of a gene/protein of interest through identifying similar sequences in “more established” organisms
• Tools:
  – Text-term search
  – Sequence similarity search
![Page 13: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/13.jpg)
Machine Learning
• Studies methods and the design of computer programs that improve based on past experience
• Why?
  – New methods are being introduced
  – Old ones should be improved
![Page 14: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/14.jpg)
“Units” of Information
• DNA (genome)
• RNA (transcriptome)
• Protein (proteome)
![Page 15: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/15.jpg)
What is Being Analyzed?
• Sequence
• Structure
• Interactions
• Pathways
• Mutations/Evolution
![Page 16: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/16.jpg)
Why?
• Increasing amount of biological information entails
  – Organization
  – Archiving
  – Global unification/harmonization
• More biological discoveries
  – Functional/Structural similarities
  – Phylogenetic/Evolutionary patterns
![Page 17: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/17.jpg)
Applications
• Medicine
• Pharmaceuticals
• Biotechnology
• Agriculture
![Page 18: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/18.jpg)
STRUCTURE DATABASES
![Page 19: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/19.jpg)
Molecular Data
• When you draw a molecule,
  – You start with atoms
  – Then proceed with the structure
  – And the three-dimensional data
• What can be stored?
  – Coordinates
  – Sequences
  – Chemical graphs
    • Atoms and bonds
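A chemical graph, as mentioned above, treats atoms as nodes and bonds as edges. The sketch below stores a water molecule this way; the variable names `atoms` and `bonds` and the helper `neighbors` are hypothetical and do not reflect how any particular structure database lays out its data.

```python
# Hypothetical minimal representation: atom id -> element symbol.
atoms = {1: "O", 2: "H", 3: "H"}

# Each bond is an unordered pair of atom ids (water has two O-H bonds).
bonds = [(1, 2), (1, 3)]


def neighbors(atom_id, bonds):
    """Return the ids of atoms directly bonded to atom_id."""
    result = []
    for a, b in bonds:
        if a == atom_id:
            result.append(b)
        elif b == atom_id:
            result.append(a)
    return result


print([atoms[i] for i in neighbors(1, bonds)])
```

Coordinates could be added as a third mapping from atom id to an (x, y, z) tuple without changing the graph itself.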
![Page 20: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/20.jpg)
Databases
• Protein Data Bank (PDB)
• Molecular Modeling Database (MMDB)
![Page 21: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/21.jpg)
Techniques in the Laboratory
• X-ray Crystallography
• Nuclear Magnetic Resonance
![Page 22: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/22.jpg)
Formats
• PDB
• mmCIF
• MMDB
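The PDB format listed above stores each atom as a fixed-column text record. The following sketch reads a few fields from one `ATOM` line by slicing at the column positions given in the published PDB file format. The function name `parse_atom_line` is an assumption for this example, and the sample line uses made-up coordinates rather than data from a real entry.

```python
def parse_atom_line(line):
    """Slice selected fields out of a PDB ATOM record (columns are 1-based in the spec)."""
    return {
        "serial": int(line[6:11]),        # cols 7-11: atom serial number
        "name": line[12:16].strip(),      # cols 13-16: atom name
        "res_name": line[17:20].strip(),  # cols 18-20: residue name
        "chain": line[21],                # col 22: chain identifier
        "res_seq": int(line[22:26]),      # cols 23-26: residue sequence number
        "x": float(line[30:38]),          # cols 31-38: x coordinate
        "y": float(line[38:46]),          # cols 39-46: y coordinate
        "z": float(line[46:54]),          # cols 47-54: z coordinate
    }


# Illustrative record assembled field by field so the column widths are visible.
sample = (
    "ATOM  "    # cols 1-6: record name
    "    1"     # cols 7-11: serial
    " "         # col 12: blank
    " N  "      # cols 13-16: atom name
    " "         # col 17: alternate location
    "MET"       # cols 18-20: residue name
    " "         # col 21: blank
    "A"         # col 22: chain
    "   1"      # cols 23-26: residue number
    "    "      # cols 27-30: insertion code and blanks
    "  27.340"  # cols 31-38: x
    "  24.430"  # cols 39-46: y
    "   2.614"  # cols 47-54: z
)

atom = parse_atom_line(sample)
print(atom["res_name"], atom["chain"], atom["x"], atom["y"], atom["z"])
```

For real work, an established parser is a safer choice than hand-written slicing, since full PDB files contain many other record types.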
![Page 23: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/23.jpg)
Structure Viewers
• Cn3D
• RasMol
• WebMol
• Mage
• VRML
• CAD
• Swiss PDB Viewer
![Page 24: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/24.jpg)
Promises of bioinformatics
• Medicine
  – Knowledge of protein structure facilitates drug design
  – Understanding of genomic variation allows the tailoring of medical treatment to the individual’s genetic make-up
  – Genome analysis allows the targeting of genetic diseases
  – The effect of a disease or of a therapeutic on RNA and protein levels can be elucidated
• The same techniques can be applied to biotechnology, crop and livestock improvement, etc.
![Page 25: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/25.jpg)
Challenges in bioinformatics
• Explosion of information
  – Need for faster, automated analysis to process large amounts of data
  – Need for integration between different types of information (sequences, literature, annotations, protein levels, RNA levels, etc.)
  – Need for “smarter” software to identify interesting relationships in very large data sets
• Lack of “bioinformaticians”
  – Software needs to be easier to access, use and understand
  – Biologists need to learn about the software, its limitations, and how to interpret its results
![Page 26: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/26.jpg)
SEQUENCE ALIGNMENT
![Page 27: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/27.jpg)
Two or More Sequences
• Measure similarity
• Determine correspondences between residues
• Find patterns of conservation
• Derive evolutionary relationships
![Page 28: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/28.jpg)
Alignment
• Correspondences of nucleotides/amino acids in two or more sequences are assigned
  – An assignment of correspondences that preserves the order of the residues within the sequences is an alignment
  – Gaps are used to achieve this
• Sequence alignment refers to the identification of residue-residue correspondences
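To make the idea of residue-residue correspondences concrete, the sketch below takes two rows that are already aligned (gaps written as `-`) and reports how many paired residues are identical. The function name `percent_identity` and the choice to divide by the number of gap-free columns are assumptions. Other definitions of identity use a different denominator.

```python
def percent_identity(row_a, row_b):
    """Percentage of gap-free columns in which both rows carry the same residue."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    # Keep only columns where a residue is paired with a residue.
    pairs = [(a, b) for a, b in zip(row_a, row_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


aligned_a = "GATTACA"
aligned_b = "GA-TGCA"

# Removing the gaps gives back the original sequences, so residue order is preserved.
print(aligned_b.replace("-", ""))
print(round(percent_identity(aligned_a, aligned_b), 1))
```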
![Page 29: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/29.jpg)
Uses
• Homology
  – Similarities
  – “Ancestry”
• Genome annotation
  – Assigning structure and function to genes
• Database queries
  – For newly-discovered/unknown sequences
![Page 30: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/30.jpg)
Tools
• Dot Plots
  – Diagonal lines of dots showing similarities between two sequences
• Scoring Matrices
  – Score reflects quality of each possible alignment; best possible score is identified
  – Scoring scheme is crucial
  – PAM (Point Accepted Mutation) and BLOSUM (BLOcks SUbstitution Matrix)
• Dynamic Programming
  – Algorithmic technique that reuses previous computations
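A dot plot, the first tool above, can be sketched in a few lines: mark every cell where the residue in one sequence equals the residue in the other, and shared regions show up as diagonal runs. The function name `dot_plot` and the two short sequences are chosen only for illustration. Practical tools usually add a sliding window and threshold to reduce noise.

```python
def dot_plot(seq_a, seq_b):
    """Return one text row per residue of seq_a: '*' where it equals a residue of seq_b."""
    return [
        "".join("*" if a == b else "." for b in seq_b)
        for a in seq_a
    ]


# Rows follow the first sequence, columns follow the second.
for row in dot_plot("GATTACA", "GATCACA"):
    print(row)
```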
![Page 31: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/31.jpg)
Scoring
• Penalties/Scores
  – Match (e.g. A – A)
  – Mismatch (e.g. A – C)
  – Gap (e.g. A – _)
• Linear Gap Penalty: uniform cost for every gap position
• Affine Gap Penalty: separate costs for gap existence (opening) vs. gap extension
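The two gap models above differ only in how the cost of a gap grows with its length. The sketch below uses made-up parameter values and one common convention for the affine case, in which the opening cost covers the first gap position. Some texts instead charge the extension cost for every position. All function names here are hypothetical.

```python
def linear_gap_penalty(length, per_position=-2):
    """Every gap position costs the same amount."""
    return per_position * length


def affine_gap_penalty(length, gap_open=-5, gap_extend=-1):
    """Opening a gap is expensive; each further position is cheaper (assumed convention)."""
    if length <= 0:
        return 0
    return gap_open + gap_extend * (length - 1)


def score_alignment(row_a, row_b, match=1, mismatch=-1, gap=-2):
    """Score two already-aligned rows column by column, using a linear gap penalty."""
    score = 0
    for a, b in zip(row_a, row_b):
        if a == "-" or b == "-":
            score += gap
        elif a == b:
            score += match
        else:
            score += mismatch
    return score


print(linear_gap_penalty(3), affine_gap_penalty(3))
print(score_alignment("GATTACA", "GA-TGCA"))
```

With these values a gap of three positions costs less under the linear model than under the affine model, while very long gaps become comparatively cheaper under the affine model.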
![Page 32: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/32.jpg)
Local vs. Global Alignments
• Global Alignment
  – Similarities across the whole length of two sequences
• Local Alignment
  – Similarities between specific parts of two sequences
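The difference shows up clearly in the scores. In this sketch, a single function computes either a global score or a local one. The local variant follows the Smith-Waterman idea of never letting a cell drop below zero and reporting the best cell anywhere in the table. The function name `alignment_score` and the scoring values are assumptions for illustration.

```python
def alignment_score(a, b, local=False, match=1, mismatch=-1, gap=-2):
    """Best alignment score of a and b; global by default, local if requested."""
    # First row: global alignment pays for leading gaps, local alignment does not.
    prev = [0 if local else j * gap for j in range(len(b) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0 if local else i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, prev[j] + gap, curr[j - 1] + gap)
            if local:
                cell = max(cell, 0)      # a local alignment may restart anywhere
                best = max(best, cell)   # and may end anywhere
            curr.append(cell)
        prev = curr
    return best if local else prev[-1]


long_seq = "XXACGTXX"
short_seq = "ACGT"
print(alignment_score(long_seq, short_seq))              # global
print(alignment_score(long_seq, short_seq, local=True))  # local
```

The global score is pulled down by the gaps needed to span the unmatched flanks, whereas the local score reflects only the shared `ACGT` region.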
![Page 33: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/33.jpg)
Programs
• Pairwise Sequence Alignment
  – BLAST
  – VAST (Vector Alignment Search Tool; compares 3D structures rather than sequences)
  – FASTA
• Multiple Sequence Alignment
  – MAFFT
![Page 34: REMINDERS](https://reader036.fdocuments.us/reader036/viewer/2022081515/56816376550346895dd45483/html5/thumbnails/34.jpg)
Needleman-Wunsch Algorithm
• Can be used for global alignments
• Maximum-value function
• A simple scoring scheme is assumed
• Three steps
  – Initialization
  – Matrix fill (scoring)
  – Traceback (alignment)
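The three steps above can be sketched as follows. This is a minimal version under a simple scoring scheme (match +1, mismatch -1, linear gap -1), with values chosen only for illustration. The function name `needleman_wunsch` and its parameters are assumptions, and when several alignments tie for the best score the traceback returns just one of them.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    # Step 1: initialization. The first row and column hold cumulative gap costs.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Step 2: matrix fill. Each cell takes the maximum of its three predecessors.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Step 3: traceback from the bottom-right corner to the top-left corner.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            row_a.append(a[i - 1])
            row_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            row_a.append(a[i - 1])
            row_b.append("-")
            i -= 1
        else:
            row_a.append("-")
            row_b.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))


best, top, bottom = needleman_wunsch("ACGT", "AGT")
print(best)
print(top)
print(bottom)
```

The table reuses previously computed cells at every step, which is the dynamic-programming idea, and the "maximum-value function" is the `max` over the three neighbouring cells.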