Bioinformatics
Sean Langford, Larry Hale

What is it?

Bioinformatics is an interdisciplinary scientific field that focuses on developing methods for storing, retrieving, organizing, and analyzing data from biological sources, usually sources of a cellular or genetic nature.

What is it? (cont.)

A major focus in bioinformatics is the production of useful software tools for generating biological knowledge, drawing on advanced skills from a variety of computer science, mathematics, and engineering fields.

What it does

Bioinformatics uses a variety of techniques to develop software tools that help produce valuable biological knowledge.

It is similar to Biological Computation.

History

The term "bioinformatics" was coined by Paulien Hogeweg in 1970 and referred to the study of information processes in biological systems.

Computers became necessary in molecular biology when protein sequences became available in the 1950s, and in genetics in 1982 as more genome sequences became available.

Bioinformatics vs. Biological Computation

The two fields have similar aims, but the major difference is in scale.

Bioinformatics deals with basic biological data and pays attention to details, while Biological Computation is a subset of computer science that builds large-scale theoretical models of biological systems in an attempt to understand these systems from an abstract view.

Goals

Bioinformatics is now focused on the creation and advancement of databases, algorithms, computational and statistical techniques, and theory to solve formal and practical problems arising from the management and analysis of biological data.

Goals (cont.)

Problems addressed so far in pursuit of this goal include producing GMOs to protect crops and providing gene therapy for a variety of genetic disorders.

Algorithms

Rather than listing the specific algorithms used, it is more appropriate to consider which algorithms and types of algorithms are not used.

Bioinformatics is a very broad field and uses a large number of algorithms to accomplish an extremely large number of tasks.

Analyzing Data

Analyzing data is a very important goal of bioinformatics at the moment.

It uses algorithms involving Artificial Intelligence, Soft Computing, Data Mining, Image Processing, and Simulation.

It relies heavily on Discrete Mathematics and Statistics.

Practical Example

A prime example of an algorithm used in bioinformatics is the LZW algorithm, which compresses and decompresses genetic strings so that the information can be stored more efficiently.

A demonstration of this usage was seen in problem 4 of our homework.

Problem 4

DNA sequences use the alphabet {A, C, G, T}. Use the LZW algorithm to compress the following DNA sequence:

ATGGAAGGAACTAATGGCCACCAAAACGGTTCATTTTGCTTGTCCACTGCCAAGGGAAATAATGATCCCTTGAACTGGGGAGCGGCGGCGGAGGCA

Compression

Answer as best agreed on by the presenters:

0, 3, 2, 2, 0, 0, 6, 8, 1, 3, 8, 5, 2, 1, 1, 0, 17, 8, 11, 6, 3, 3, 18, 24, 24, 16, 28, 25, 18, 12, 16, 18, 9, 14, 7, 31, 12, 5, 11, 15, 10, 16, 6, 1, 48, 16, 0
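For readers who want to trace how such a code list arises, here is a minimal Python sketch of LZW compression and decompression. It is an illustration under stated assumptions, not the presenters' own solution: the function names `lzw_compress` and `lzw_decompress` are hypothetical, the initial dictionary is assumed to be A=0, C=1, G=2, T=3 with new entries numbered from 4 (which is consistent with the opening codes of the answer above), and the input is assumed to contain only those four letters.

```python
# Sequence from Problem 4.
DNA = ("ATGGAAGGAACTAATGGCCACCAAAACGGTTCATTTTGCTTGTCCACTGCC"
       "AAGGGAAATAATGATCCCTTGAACTGGGGAGCGGCGGCGGAGGCA")


def lzw_compress(text, alphabet="ACGT"):
    """Return the list of LZW codes for text.

    Assumption: single letters get codes 0..len(alphabet)-1 in the order
    given, and every character of text belongs to the alphabet.
    """
    dictionary = {ch: i for i, ch in enumerate(alphabet)}
    next_code = len(dictionary)
    codes = []
    current = ""
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            # Keep extending the current match.
            current = candidate
        else:
            # Emit the longest known match, then learn the new string.
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decompress(codes, alphabet="ACGT"):
    """Rebuild the original string from codes made by lzw_compress."""
    if not codes:
        return ""
    dictionary = {i: ch for i, ch in enumerate(alphabet)}
    next_code = len(dictionary)
    previous = dictionary[codes[0]]
    pieces = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # The code refers to the entry that is about to be created.
            entry = previous + previous[0]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        pieces.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(pieces)


codes = lzw_compress(DNA)
print(codes)
print(lzw_decompress(codes) == DNA)
```

The decompressor rebuilds the same dictionary as it reads the codes, so only the code list and the starting alphabet need to be stored.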

Practical Example Part 2

Another common example in bioinformatics is the family of algorithms designed to find the longest common subsequence.

This is also demonstrated in problem 1 of the homework.

Problem 1

Given the following DNA sequences, S1 = (ATGGAACGAACT) and S2 = (TTGCCAGTAC), find the Longest Common Subsequence (LCS) of S1 and S2 using the table below. Show the values and arrows in each cell and circle the cells that are part of the LCS. Write the LCS.

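Longest Common Subsequence

Answer as best agreed on by the presenters: T-G---G-AC (that is, TGGAC, written against S2 with a dash for each skipped character).

Note that TGCGAC is also a common subsequence of S1 and S2 and is one character longer, so the answer above is not the longest possible.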
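The table-filling method that Problem 1 asks for can be sketched in Python as follows. This is a minimal illustration of the standard dynamic-programming approach, not the presenters' worked table: the names `lcs` and `is_subsequence` are hypothetical, and when several subsequences tie for longest, the traceback used here simply returns one of them.

```python
S1 = "ATGGAACGAACT"
S2 = "TTGCCAGTAC"


def lcs(s1, s2):
    """Return one longest common subsequence of s1 and s2."""
    m, n = len(s1), len(s2)
    # table[i][j] = LCS length of s1[:i] and s2[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                # Match: extend the diagonal neighbour (diagonal arrow).
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # No match: copy the larger of the cell above or to the left.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk the arrows back from the last cell to recover the subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if s1[i - 1] == s2[j - 1]:
            out.append(s1[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def is_subsequence(candidate, text):
    """Check that candidate appears in text in order, gaps allowed."""
    remaining = iter(text)
    return all(ch in remaining for ch in candidate)


result = lcs(S1, S2)
print(result, len(result))
print(is_subsequence(result, S1) and is_subsequence(result, S2))
```

For S1 and S2, the final cell of the table holds 6, so the sketch reports a common subsequence of length 6.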

Questions?

Dost thou haveth any?