Algorithmic approach to computational biology using graphs
Submitted by
S P Sajjan
Research Guide
Dr. Ishwar Baidari, MCA, Ph.D.
Dept. of Computer Science
Karnatak University, Dharwad.
What is Computational Biology?
"Computational biology is not a "field", but an "approach" involving
the use of computers to study biological processes and hence it is an area as
diverse as biology itself."
• Biological data
Biological data are data or measurements collected from
biological sources, which are often stored or exchanged in a
digital form. Biological data are commonly stored in files or
databases.
Ex: DNA sequences and population data used in ecology.
• Functional molecules
In organic chemistry, functional groups are specific groups
of atoms or bonds within molecules that are responsible for the
characteristic chemical reactions of those molecules.
• Mining in molecular biology
Text-mining in molecular biology is defined as the
automatic extraction of information about genes, proteins and
their functional relationships from text documents.
It draws on information science, bioinformatics and computational
linguistics.
• Defining Metabolism
The term, 'Metabolism' refers to biochemical processes
that happen within a person or living organism.
Metabolism consists of both 'Catabolism' and 'Anabolism',
which are, respectively, the breakdown and the buildup of
substances.
Cellular networks
• Cellular networks are sets of interacting molecules within cells.
• They mainly include protein-protein interactions, metabolism, gene
transcriptional regulatory networks and signal transduction pathways.
• All of them are different subsets of a single large-scale cellular
network, since they are eventually cross-linked.
Purpose of Computational Biology
• Computational Biology can be summarized as the field that uses
high-throughput technology and computation to study the
complex organizational patterns of biological systems and
how they contribute to normal physiology and disease.
• Experimental systems biology uses various genomics/proteomics
approaches to study a large number of genes or proteins at a
genome scale, which naturally yields a large volume of data to
be interpreted and put within the context of real biology.
• There are several large nation-wide projects aiming at
characterizing the genome and proteome of different (e.g.,
cancer) cells.
• Billions of dollars are being spent on this research, which spans
many of the top institutions across the nation.
• Classical molecular biology has mainly focused on gene-centric or
molecule-centric research.
• 30-40 years of this research have led to our realization of the
incredible complexity of biological systems.
• We therefore need more global experimental approaches and,
equally importantly, computational ones.
Relevance of the study and present status
Issues Related to Computational Biology
• ~22,000 known human genes in the genome sequence
• ~60,000 known protein-protein interactions in human
• Millions of indirect relationships between genes
• Typical genomic experiment: millions of data points
Statement of Research Problem
• The theory of complex networks plays an important role in a
wide variety of disciplines, ranging from communication to
molecular and population biology.
• The focus of this Research is on graph theory methods for
computational biology.
• We will survey methods and approaches in graph theory,
along with current applications in biomedical informatics.
• Within the fields of Biology and Medicine, potential
applications of network analysis using graph theory
include identifying drug targets and determining the role of
proteins or genes of unknown function.
• There are several biological domains where graph theory
techniques are applied for knowledge extraction from data.
We have classified these problems as follows.
• Modeling methods of bio-molecular networks such as protein
interaction networks, metabolic networks, as well as
transcriptional regulatory networks.
• Measurement of centrality and importance in bio-molecular
networks. To identify the most important nodes in a large
complex network is of fundamental importance in
computational biology.
• We will introduce several studies that applied centrality
measures to identify structurally important genes or proteins.
• Mining new pathways from bio-molecular networks.
• Experimentally validating an identified pathway in different
organisms requires huge amounts of time and effort.
• Thus, there is a need for graph theory tools that help scientists
predict pathways in bio-molecular networks.
• Our primary goal in the present Research is to provide as broad a
survey as possible of the major advances made in this field.
Moreover, we also highlight what has been achieved as well as
some of the most significant open issues that need to be addressed.
• Finally, we hope that this Research will serve as a useful
introduction to the field for those unfamiliar with the literature.
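The centrality measures mentioned in the list above can be illustrated with degree centrality, the simplest such measure: a node's score is the fraction of other nodes it is directly connected to. The following is only a minimal sketch under stated assumptions, not the method of any particular study; the function name degree_centrality and the protein labels P1 to P5 are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {node: degree / (n - 1)} for an undirected edge list."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops in this sketch
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical toy interaction network; P1..P5 are placeholder names.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5")]
centrality = degree_centrality(toy_edges)

# Rank nodes from most to least central.
ranked = sorted(centrality, key=centrality.get, reverse=True)
print(ranked[0], centrality[ranked[0]])
```

In this toy network P1 touches three of the four other nodes, so it ranks first. Other centrality measures (for example betweenness or closeness) would need path computations and are not shown here.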
The concept of Graph theory
• Graph: A graph G consists of a set of vertices V(G) and a set of
edges E(G).
• Simple Graph: In a simple graph, two vertices V_i and V_j of G are
linked if there exists an edge (V_i, V_j) ∈ E(G) connecting them,
where V_i ∈ V(G) and V_j ∈ V(G).
• Undirected Graph: An undirected graph is a graph, i.e., a set of
objects (called vertices or nodes) that are connected together,
where all the edges are bidirectional. An undirected graph is
sometimes called an undirected network.
• Directed Graph: A directed graph is a graph, i.e., a set of objects
(called vertices or nodes) that are connected together, where all
the edges are directed from one vertex to another. A directed
graph is sometimes called a digraph or a directed network.
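The definitions above can be made concrete with an adjacency-set representation. This is a minimal sketch, assuming vertices are hashable labels and edges are (u, v) pairs; the helper name build_adjacency and the labels a, b, c are hypothetical.

```python
def build_adjacency(vertices, edges, directed=False):
    """Map each vertex in V(G) to the set of vertices it has an edge to."""
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        if u not in adjacency or v not in adjacency:
            raise ValueError(f"edge ({u}, {v}) uses a vertex outside V(G)")
        adjacency[u].add(v)
        if not directed:
            # An undirected edge can be traversed in both directions.
            adjacency[v].add(u)
    return adjacency


V = ["a", "b", "c"]
E = [("a", "b"), ("b", "c")]

undirected = build_adjacency(V, E)
digraph = build_adjacency(V, E, directed=True)

print(undirected["b"])  # neighbors of b in the undirected graph
print(digraph["b"])     # successors of b in the directed graph
```

The same edge list gives b two neighbors in the undirected graph but only one successor in the digraph, which is the whole difference between the two definitions.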
Modeling of Bio-molecular networks as Graphs
• In Biology, transcriptional regulatory networks and metabolic
networks would usually be modeled as directed graphs.
• For instance, in a Transcriptional regulatory network, nodes
represent genes with edges denoting the Transcriptional
relationship between them.
• In recent years, attention has focused on the protein-protein
interaction (PPI) networks of various simple organisms. These
networks describe the direct physical interactions between the
proteins in an organism’s proteome, and there is no direction
associated with the interactions in such networks.
• Hence, PPI networks are typically modeled as undirected
graphs, in which nodes represent proteins and edges represent
interactions.
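The practical effect of this modeling choice shows up as soon as one searches the network. Below is a minimal sketch, assuming adjacency lists stored in plain dictionaries; the gene and protein names are hypothetical placeholders, and breadth-first search is used only as a simple illustration, not as a pathway-prediction method from the literature.

```python
from collections import deque


def shortest_path(adjacency, source, target):
    """Breadth-first search; returns a list of nodes, or None if unreachable."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in adjacency.get(node, ()):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical regulatory network: directed edges, regulator -> target.
regulatory = {"geneA": ["geneB"], "geneB": ["geneC"], "geneC": []}

# Hypothetical PPI network: undirected, so each edge is stored both ways.
ppi = {"p1": ["p2"], "p2": ["p1", "p3"], "p3": ["p2"]}

downstream = shortest_path(regulatory, "geneA", "geneC")
upstream = shortest_path(regulatory, "geneC", "geneA")
ppi_route = shortest_path(ppi, "p3", "p1")
print(downstream, upstream, ppi_route)
```

In the directed regulatory network geneC is reachable from geneA but not the reverse, whereas in the undirected PPI network a route exists in either direction.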
Computational Limitations
• The challenges of computational biology are enormous, and may exceed
the expected increases in computing capability. Several years ago the
computational power of “state-of-the-art parallel supercomputers”
allowed highly predictive calculations treating only hundreds of atoms for
time scales of picoseconds, while molecular dynamics calculations of tens
of thousands of atoms for nanoseconds were becoming common, although
they were somewhat less predictive.
• A straightforward application of Moore’s Law would predict an increase
of about three to four doublings in capability in the intervening five or six
years.
• Using current methodologies, achieving the desired level of computation
would represent an increase of greater than ~10^9 times in computing
power.
• It must be noted that even an increase of ~10^9 in computing power would
only provide the ability to simulate certain cellular systems, and may not
provide a means to predictively model whole cells, organs or organisms.
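The gap between these two figures can be checked with a little arithmetic. The sketch below takes the required increase as a factor of 10^9 and uses only the numbers quoted above; the variable names are illustrative.

```python
import math

# Figures quoted in the text above.
doublings_low, doublings_high = 3, 4   # three to four doublings
required_factor = 1e9                  # required increase, read as 10^9

# Each doubling multiplies capability by 2.
gain_low = 2 ** doublings_low
gain_high = 2 ** doublings_high

# Number of doublings needed to reach the required factor.
doublings_required = math.log2(required_factor)

print(f"Gain from three to four doublings: {gain_low}x to {gain_high}x")
print(f"Doublings needed for a 1e9x increase: {doublings_required:.1f}")
```

Three to four doublings give only an 8x to 16x gain, while a factor of 10^9 needs roughly 30 doublings, which is why the slide expects the challenge to outpace hardware growth alone.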