Molecular Biology Background. Debasis Mitra, Florida Tech. Credit: Pevzner text-site. (8/9/2015)

Post on 23-Dec-2015


04/19/23 dmitra 1

Molecular Biology Background

Debasis Mitra, Florida Tech

Credit: Pevzner text-site

Section 1: What is Life made of?

2 types of cells: Prokaryotes vs. Eukaryotes

Life begins with the Cell

A cell is the smallest structural unit of an organism that is capable of independent functioning

All cells have some common features

Prokaryotes and Eukaryotes

• According to the most recent evidence, there are three main branches to the tree of life: Archaea, Bacteria, and Eukarya.
• Prokaryotes include Archaea (“ancient ones”) and Bacteria.
• Eukaryotes make up the domain Eukarya, which includes plants, animals, fungi, and certain algae.

Prokaryotes and Eukaryotes, continued

Prokaryotes                                  Eukaryotes
---------------------------------------------------------------------------
Single cell                                  Single or multi cell
No nucleus                                   Nucleus
No membrane-bound organelles                 Membrane-bound organelles
One piece of circular DNA                    Chromosomes
No mRNA post-transcriptional modification    Exon/intron splicing

Overview of organizations of life

Nucleus = library
Chromosomes = bookshelves
Genes = books

Almost every cell in an organism contains the same libraries and the same sets of books.

Books represent all the information (DNA) that every cell in the body needs so it can grow and carry out its various functions.

Chromosomes

Organism                             Number of base pairs    Number of chromosomes (haploid)
---------------------------------------------------------------------------------------------------------
Prokaryotic
Escherichia coli (bacterium)         4 x 10^6                1

Eukaryotic
Saccharomyces cerevisiae (yeast)     1.35 x 10^7             16
Drosophila melanogaster (insect)     1.65 x 10^8             4
Homo sapiens (human)                 2.9 x 10^9              23
Zea mays (corn)                      5.0 x 10^9              10

Bio-molecules

Nucleic acids (DNA, RNA): Library of life

Proteins: Workhorse of life

Fatty acids, carbohydrates, and other supporting molecules

DNA

DNA has a double helix structure. Each unit is composed of a sugar molecule, a phosphate group, and a base (A, C, G, T).

New strands are always synthesized from the 5’ end to the 3’ end, in both transcription and replication, and sequences are written in that direction by convention:

5’ ATTTAGGCC 3’
3’ TAAATCCGG 5’

DNA, RNA, and the Flow of Information

[Figure: Replication (DNA to DNA), Transcription (DNA to RNA), Translation (RNA to protein)]

Proteins

Functions:
• Structural
• Enzymes
• Information exchange (e.g., across cell membranes)
• Transporting other molecules (e.g., oxygen to cells)
• Activating-deactivating genes
• Etc.

Proteins

Amino acids

A protein is a chain of amino acid “residues”

20 to 5000 residues long, typically a few hundred

Protein structure

• Important for its function
• Primary structure: sequence
• Secondary structure: a few topological features
• Tertiary structure: 3D folding
• Quaternary structure: protein complex

Protein Folding

• Proteins tend to fold into the lowest free energy conformation.
• Proteins begin to fold while the peptide is still being translated.
• Proteins bury most of their hydrophobic residues in an interior core.
• Most proteins contain the secondary structures α helices and β sheets.
• Molecular chaperones, hsp60 and hsp70, work with other proteins to help fold newly synthesized proteins.
• Much of the protein modification and folding occurs in the endoplasmic reticulum and mitochondria.

Protein Folding (cont’d)

The structure that a protein adopts is vital to its chemistry

Its structure determines which of its amino acids are exposed to carry out the protein’s function

Its structure also determines what substrates it can react with

Nucleic acids

Two types:

DNA: Deoxyribonucleic acid
RNA: Ribonucleic acid

Nucleic acids

A chain of sugar molecules, linked by phosphate groups, forms the backbone of the polymer

Two types of sugar: ribose (RNA), 2’-deoxyribose (DNA)

Nucleic acids: DNA

4 types of bases connected to sugar molecules: Adenine (a), Guanine (g), Thymine (t) and Cytosine (c)

A pairs with T, and G pairs with C, through hydrogen bonds (two for A-T, three for G-C)

An Introduction to Bioinformatics Algorithms www.bioalgorithms.info

[Figure: The Purines and The Pyrimidines]


Nucleic acids: DNA

Double stranded: two strands of sugar-phosphate chains

Each strand is directed: 5’ to 3’

The strands are attached on the inside by base pairings (a-t and g-c)

Double helix of DNA

Discovery of DNA

DNA Sequences: Chargaff and Vischer, 1949
• DNA consists of A, T, G, C: Adenine, Thymine, Guanine, Cytosine
• Chargaff's Rule: noticing that #A ≈ #T and #G ≈ #C
• A “strange but possibly meaningless” phenomenon.

Wow!! A Double Helix: Watson and Crick, Nature, April 25, 1953
• Watson and Crick: 1 biologist, 1 physics Ph.D. student, 900 words, Nobel Prize

Rich, 1973: structural biologist at MIT. DNA’s structure in atomic resolution.

Nucleic acids: DNA

Each strand is the reverse complement of the other

If s=agacgt

reverse(s)=tgcaga

reverse-complement(s)=acgtct

Double-strand: 5’--agacgt->3’

3’<-tctgca--5’
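The reverse and reverse-complement operations on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides. The function names are made up for the example, and the input is assumed to contain only the lowercase bases a, c, g, t.

```python
# Complement of each base, using the lowercase notation of the slide.
COMPLEMENT = {"a": "t", "t": "a", "g": "c", "c": "g"}


def reverse(s: str) -> str:
    """Return the sequence read backwards."""
    return s[::-1]


def reverse_complement(s: str) -> str:
    """Return the opposite strand, written in its own 5' to 3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(s))


s = "agacgt"
print(reverse(s))             # tgcaga
print(reverse_complement(s))  # acgtct
```

Applying `reverse_complement` twice returns the original strand, which is one quick sanity check on the pairing table.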

Nucleic acids: DNA

3D structure is helical

Double-stranded helix: like a twisted step ladder

Each unit is a base pair (sugar-base-base-sugar)

DNA molecules in cells are chromosomes (the human genome is ~3*(10^9) bp long in total)

Squeezed 3D structure in the cell may have functional importance (not well studied)


DNA Replication

Nucleic acids: RNA

Uses u (uracil) in place of t as a base

Mostly single stranded, though it may be double stranded

Functions: Information storage like DNA, sometimes workhorse like proteins

Possible evolutionary precursor to DNA and protein

Genetic code

Proteins do almost all the work!!

Information for coding proteins is stored in DNA (or RNA): genes

Three consecutive bases in a gene code for an amino acid, or for the STOP signal: a codon

The table mapping codons to amino acids is called the genetic code
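As a rough illustration of what the table looks like as a data structure, the sketch below builds the standard genetic code as a Python dictionary from DNA codons to one-letter amino acid codes, with `*` standing for STOP. It is not part of the original slides. The compact 64-letter string follows the usual T, C, A, G ordering of the standard table, and the helper `codons_for` is a name invented for this example.

```python
from itertools import product

# Standard genetic code, listed in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons to its amino acid ('*' marks a STOP codon).
GENETIC_CODE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid: str) -> list[str]:
    """Return every codon that codes for the given one-letter amino acid."""
    return sorted(c for c, aa in GENETIC_CODE.items() if aa == amino_acid)


print(len(GENETIC_CODE))    # 64
print(GENETIC_CODE["ATG"])  # M
print(codons_for("*"))      # ['TAA', 'TAG', 'TGA']
```

Because 64 codons map onto 20 amino acids plus STOP, most amino acids have several codons; `codons_for("L")` lists six for leucine.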

Cell Information: Instruction book of Life

DNA, RNA, and proteins are examples of strings written in either the four-letter nucleotide alphabet of DNA and RNA (A C G T/U) or the twenty-letter amino acid alphabet of proteins.

Each amino acid (Leu, Arg, Met, etc.) is coded by 3 nucleotides, called a codon.

Overview of DNA to RNA to Protein

A gene is expressed in two steps

1) Transcription: RNA synthesis
2) Translation: Protein synthesis

Central Dogma of Biology

The information for making proteins is stored in DNA. There is a process (transcription and translation) by which the information in DNA is used to make protein. By understanding this process and how it is regulated, we can make predictions and models of cells.

[Figure labels: Sequence analysis, Gene Finding, Protein Sequence Analysis, Assembly]

Transcription

Genes are transcribed to mRNA, which is then translated to protein: typically one gene to one protein

Genes are subsequences on chromosomes, starting at a promoter region and ending around a stop codon

Transcription

Steps:
• The DNA strands are separated over the gene after the promoter is recognized (there may be other regulatory regions upstream)
• mRNA is copied from the gene
• Introns are spliced out of the mRNA, keeping only the exons
• The ribosome (an rRNA and protein complex) works on the mRNA

Transcription

The process of making RNA from DNA

Catalyzed by the enzyme RNA polymerase

Needs a promoter region to begin transcription.

~50 bases/second in bacteria, but multiple transcriptions can occur simultaneously

http://ghs.gresham.k12.or.us/science/ps/sci/ibbio/chem/nucleic/chpt15/transcription.gif
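At the string level, transcription can be sketched as below. This is a simplified model that ignores promoters, splicing, and strand selection, and the function names are invented for the example. It assumes uppercase DNA letters and uses the two strands from the earlier DNA slide (5’ ATTTAGGCC 3’ and 3’ TAAATCCGG 5’) as input.

```python
# RNA base that pairs with each DNA base on the template strand.
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_from_template(template_3_to_5: str) -> str:
    """Build the mRNA (5' to 3') by pairing against a template given 3' to 5'."""
    return "".join(RNA_PAIR[base] for base in template_3_to_5.upper())


def transcribe_from_coding(coding_5_to_3: str) -> str:
    """Shortcut: the mRNA equals the coding strand with T replaced by U."""
    return coding_5_to_3.upper().replace("T", "U")


print(transcribe_from_template("TAAATCCGG"))  # AUUUAGGCC
print(transcribe_from_coding("ATTTAGGCC"))    # AUUUAGGCC
```

Both routes give the same mRNA, which is why sequence databases usually store only the coding strand.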

Definition of a Gene

Regulatory regions: up to 50 kb upstream of +1 site

Exons: protein coding and untranslated regions (UTR)
• 1 to 178 exons per gene (mean 8.8)
• 8 bp to 17 kb per exon (mean 145 bp)

Introns: splice acceptor and donor sites, junk DNA
• average 1 kb to 50 kb per intron

Gene size: Largest: 2.4 Mb (Dystrophin). Mean: 27 kb.

Translation

• tRNAs bind to codons on the mRNA
• On its other end, each tRNA carries the appropriate amino acid
• Amino acids are joined together into a chain
• No tRNA for the STOP codon
• Every step is facilitated by an appropriate enzyme

Central Dogma of biology

Translation, continued

• Catalyzed by the ribosome
• Using two different sites, the ribosome continually binds tRNA, joins the amino acids together, and moves to the next location along the mRNA
• ~10 codons/second, but multiple translations can occur simultaneously

http://wong.scripps.edu/PIX/ribosome.jpg
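The codon-by-codon movement of the ribosome can be mimicked with a short loop. The sketch below is a simplification, not part of the original slides: it assumes the standard genetic code, starts at the first AUG it finds, and stops at the first STOP codon. The function name `translate` and the example mRNA are invented for illustration.

```python
from itertools import product

# Standard genetic code for RNA codons, in U, C, A, G order ('*' = STOP).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first STOP codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    # Step along the mRNA one codon (three bases) at a time.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # no tRNA matches a STOP codon, so the chain ends here
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GGAUGGCUUGGUAAGC"))  # MAW
```

In the example, the codons read are AUG (Met), GCU (Ala), UGG (Trp), and UAA (STOP), giving the three-residue chain `MAW`.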

Revisiting the Central Dogma

In going from DNA to proteins, there is an intermediate step where mRNA is made from DNA, which then makes protein

This is known as The Central Dogma

Why the intermediate step?

DNA is kept in the nucleus, while protein synthesis happens in the cytoplasm, with the help of ribosomes


The Central Dogma (cont’d)

Open Reading Frame

Three reading frames in a strand

Complementary strand may have another three frames
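To make the six frames concrete, the sketch below splits a DNA string into codons at each of the three offsets on the given strand and on its reverse complement. It is an illustration only. The function names and the frame labels (`+1` to `+3`, `-1` to `-3`) are conventions chosen for this example, and the input is assumed to contain only A, C, G, T.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def codons(seq: str, offset: int) -> list[str]:
    """Split seq into complete codons, starting at the given offset (0, 1, or 2)."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]


def six_frames(dna: str) -> dict[str, list[str]]:
    """Return the codons of all three frames on each of the two strands."""
    forward = dna.upper()
    backward = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = codons(forward, offset)
        frames[f"-{offset + 1}"] = codons(backward, offset)
    return frames


for name, frame in six_frames("ATGAAATGA").items():
    print(name, frame)
```

For `ATGAAATGA`, frame `+1` reads `ATG AAA TGA` (a start codon, one lysine codon, and a stop codon), while the other five frames read the same bases in different groupings.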

Types of chromosomes

Prokaryotes (bacteria, blue-green algae): circular

Eukaryotes (have a nuclear membrane): diploid (humans have 23 pairs)

Homologous genes and alleles (e.g., the A, B, and O alleles of the human ABO blood group gene)

Haploid chromosomes in eukaryote sex cells

DNA Sequencing

A DNA fragment is split at each position, starting from one end

Four tubes: one containing molecules ending with G, one with A, one with T and another one with C

Electrophoresis separates the chunks of different sizes in each tube [page 22]

Information is recombined to sequence the DNA chunk

Can be done only for chunks up to ~1K bp long
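The four-tube idea can be illustrated with a toy model in Python. It is not a simulation of the actual chemistry: it assumes every prefix of the fragment is produced exactly once and that electrophoresis reports fragment lengths without error. The function names are invented for this example.

```python
def four_tubes(dna: str) -> dict[str, list[int]]:
    """Sort the lengths of all prefixes of dna into tubes by their final base."""
    tubes = {"A": [], "C": [], "G": [], "T": []}
    for length in range(1, len(dna) + 1):
        last_base = dna[length - 1]
        tubes[last_base].append(length)
    return tubes


def read_sequence(tubes: dict[str, list[int]]) -> str:
    """Recombine the tubes: the piece of length n reveals the base at position n."""
    base_at = {
        length: base
        for base, lengths in tubes.items()
        for length in lengths
    }
    return "".join(base_at[n] for n in sorted(base_at))


tubes = four_tubes("GATTACA")
print(tubes)                 # {'A': [2, 5, 7], 'C': [6], 'G': [1], 'T': [3, 4]}
print(read_sequence(tubes))  # GATTACA
```

Each tube only says which lengths end in its base; sorting all lengths together across the four tubes recovers the order of the bases.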

DNA Sequencing

Human DNA is ~3*(10^9) bp long

Restriction enzymes cut at restriction sites (a tool of genetic engineering) [page 18]

After sequencing, information from the fragments needs to be recombined to get the broader picture
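As a small illustration of cutting at restriction sites, the sketch below splits one strand of DNA wherever a recognition site occurs. The defaults model EcoRI, which recognizes GAATTC and cuts after the first G. The function name `digest` and its parameters are assumptions made for this example, and only the top strand is modeled.

```python
def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut dna at every occurrence of site, cut_offset bases into the site."""
    fragments = []
    start = 0
    position = dna.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        position = dna.find(site, position + 1)
    fragments.append(dna[start:])  # the piece after the last cut
    return fragments


print(digest("AAGAATTCCCGGAATTCTT"))  # ['AAG', 'AATTCCCGG', 'AATTCTT']
```

Joining the fragments in order gives back the original string, but once they are mixed in a tube that order is lost.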

DNA Sequencing

Depends on finding a restriction site/enzyme that fragments DNA into pieces of appropriate size

The privately funded TIGR project (Celera now) used heat and vibration to create fragments

Recombining information is then no longer trivial, because each fragment’s location is no longer known

Needed: a fragment assembly algorithm
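To give a feel for what a fragment assembly algorithm has to do, here is a toy greedy assembler that repeatedly merges the two fragments with the longest suffix-prefix overlap. It is only a sketch under strong assumptions (error-free reads, a single strand, no repeats) and is not the method used by any particular sequencing project. The function names and the `min_len` threshold are chosen for this example.

```python
def overlap(a: str, b: str, min_len: int = 1) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Merge the best-overlapping pair until no overlap of min_len remains."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more; leave the pieces separate
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


reads = ["GCCGGAATAC", "ATTAGACCTG", "CCTGCCGGAA"]
print(greedy_assemble(reads))  # ['ATTAGACCTGCCGGAATAC']
```

The greedy choice is a heuristic: on real data with repeats and sequencing errors it can merge the wrong pair, which is part of why fragment assembly is a hard algorithmic problem.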

DNA Sequencing

Needs multiple copies of the DNA

Recombinant DNA: biologically copying them within host organisms

Polymerase Chain Reaction: heat to separate the two strands of DNA, then let each strand attract nucleotides to form double-stranded DNA, repeat
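Because each PCR cycle turns every double-stranded molecule into two, the number of copies grows exponentially with the number of cycles. The sketch below works through that arithmetic. The `efficiency` parameter (the fraction of molecules successfully copied per cycle) is an assumption added for illustration; the ideal case described on the slide corresponds to an efficiency of 1.0.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Expected number of double-stranded copies after the given number of cycles."""
    return initial_copies * (1 + efficiency) ** cycles


# Ideal doubling: one molecule becomes 2**30 (about a billion) after 30 cycles.
print(pcr_copies(1, 30))

# With only half of the molecules copied per cycle, growth is much slower.
print(pcr_copies(1, 30, efficiency=0.5))
```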