Bioinformatics Topics Not Covered in this Course BMI 730

Post on 08-Jan-2016



Bioinformatics Topics Not Covered in this Course

BMI 730 Kun Huang

Department of Biomedical Informatics, Ohio State University

• Non-coding RNA
  • MicroRNA Related Bioinformatics Issues
    • MicroRNA prediction and recognition
    • Secondary structure prediction
    • Target prediction
• Microbial Related Bioinformatics
  • Metagenomics
• Other Omics
• Other Informatics

Non-coding RNA
• Non-coding DNA
  • Junk DNA
  • Pseudogenes
  • Retrotransposons - Human Endogenous Retroviruses (HERVs)
  • C-value enigma (e.g., the Amoeba dubia genome has more than 670 billion bases; the pufferfish genome is about 1/10 the size of the human genome)
• Findings from ENCODE – nearly the entire genome is transcribed

Non-coding RNA (ncRNA)
• Any RNA molecule that is not translated into a protein.
• Also called sRNA, npcRNA, nmRNA, snmRNA, fRNA
• Also including tRNA, rRNA, snoRNA, microRNA (miRNA), siRNA, piRNA, long ncRNA (e.g., Xist), shRNA
• Note the difference between siRNA and miRNA

Non-coding RNA (ncRNA)
• RNA-induced silencing complex (RISC)
• RNA-induced transcriptional silencing (RITS)

MicroRNA (miRNA)
• Another level of regulation

[Figure, panels a-c: regulatory relationships among Myc, E2F (E2F1, E2F2, E2F3), and the mir-17-92 cluster (17-5p, 17-3p, 18a, 19a, 20a, 19b, 92-1)]

Reviewed by: Coller et al. (2007), PLoS Genet 3(8): e146
Figures from Dr. Baltz Agula

MicroRNA (miRNA)

Non-coding RNA
• MicroRNA Related Bioinformatics Issues
  • Secondary structure prediction
  • MicroRNA prediction and recognition
  • Target prediction
• Databases

Secondary structure prediction
• Applications
  • RNA folding dynamics
  • ncRNA discovery
  • Microarray probe validation/comparison

Wang et al. Genome Biology 2004 5:R65

Secondary structure prediction
- Physics-based models
  - Minimizing free energy / dynamic programming / other optimization schemes
  - Parameters come from empirical studies of RNA structural energetics (e.g., nearest-neighbor interactions in stacking base pairs, measured using synthesized oligonucleotides)
  - Restricted by the experimental procedure
  - Scoring models are used
  - Most ignore sequence dependence of hairpin, bulge, internal, and multi-branch loop energies
  - Multi-branch loop energies rely on ad hoc scores
  - Still top performance
  - Mfold, ViennaRNA, PKnots, RDfold, etc.
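To make the dynamic programming idea concrete, here is a minimal Python sketch of the Nussinov base-pair maximization recursion. It is a deliberately simplified stand-in for free-energy minimization: it counts base pairs instead of summing nearest-neighbor energies, and it is not how Mfold or ViennaRNA score structures. The function name `nussinov`, the `min_loop` parameter, and the allowed-pair set (Watson-Crick plus G-U wobble) are assumptions made for illustration.

```python
def nussinov(seq, min_loop=3):
    """Return (max_pairs, dot_bracket) for an RNA sequence.

    Simplified model: maximize the number of nested base pairs,
    requiring at least `min_loop` unpaired bases inside any hairpin.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    # dp[i][j] = maximum number of pairs within seq[i..j] (inclusive)
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best

    # Traceback: recover one optimal structure in dot-bracket notation.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue  # too short to contain a pair
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in pairs:
                left = dp[i][k - 1] if k > i else 0
                if dp[i][j] == left + 1 + dp[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return (dp[0][n - 1] if n else 0), "".join(structure)
```

The triple loop gives cubic running time in the sequence length, which is the same order as the energy-based recursions without pseudoknots. Replacing the `+ 1` pair score with measured stacking and loop energies is what turns this counting scheme into a physics-based model.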

Secondary structure prediction
- Probabilistic approach
  - Stochastic context-free grammars (SCFG) – e.g., QRNA
    - Specify grammar rules that induce a joint probability distribution over possible RNA structures and sequences
    - Parameters are easily learned from data without experiments
    - Parameters may not have physical meanings
    - Performance inferior to physics-based methods
  - Extensions: Conditional log-linear model (CLLM) – e.g., CONTRAfold
    - Integrate the learning procedure with energy-based scoring systems
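As a sketch of what a conditional log-linear model looks like, the general form can be written as follows. The notation is an assumption introduced here and does not come from the slides: x is the sequence, y a candidate structure, F(x, y) a vector of feature counts (for example, stacking and loop features), and w the learned weight vector.

```latex
P(y \mid x; w) = \frac{\exp\left(w^{\top} F(x, y)\right)}{\sum_{y'} \exp\left(w^{\top} F(x, y')\right)}
```

Because the score w^T F(x, y) is a weighted sum of structural features, it can play the same role as an energy function, while the weights are fitted to known structures instead of being measured experimentally.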

Secondary structure prediction

CONTRAfold PKnotRG

Secondary structure prediction
- Comparative approach
  - Single-sequence prediction methods (physics-based, SCFG) have difficulty searching all configurations
  - Structures that have been conserved by evolution are far more likely to be the functional form

MicroRNA prediction and discovery
- Experimental approach - cloning
- MicroRNA array (OSU microarray facility)
- Massive sequencing
  - Select segments in the range of 20-25 nt
  - Using Solexa/SOLiD sequencer
  - Map to genome
  - Enrichment analysis / peak calling
  - Experimental validation
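The first steps of the sequencing workflow (size selection, then mapping) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: reads are plain strings, mapping is exact forward-strand string matching, and the function names `select_small_rna_reads` and `exact_hits` are hypothetical. Real pipelines use a dedicated short-read aligner that tolerates mismatches and searches both strands.

```python
from collections import Counter


def select_small_rna_reads(reads, min_len=20, max_len=25):
    """Keep reads of 20-25 nt and collapse identical sequences into counts."""
    kept = [r.upper() for r in reads if min_len <= len(r) <= max_len]
    return Counter(kept)


def exact_hits(read, genome):
    """Return 0-based start positions of exact forward-strand matches."""
    hits = []
    start = genome.find(read)
    while start != -1:
        hits.append(start)
        start = genome.find(read, start + 1)  # allow overlapping matches
    return hits
```

The read counts per sequence, combined with their genomic positions, are the raw material for the enrichment analysis / peak calling step.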

MicroRNA prediction and discovery
- Bioinformatics / machine learning approach

Wang et al. Genome Biology 2004 5:R65

MicroRNA prediction and discovery
- Bioinformatics / machine learning approach
  - Using evolutionary information

Nam, J.-W. et al. Nucl. Acids Res. 2005 33:3570-3581; doi:10.1093/nar/gki668

MicroRNA prediction and discovery
- Bioinformatics / machine learning approach
  - Support vector machine / need features
• Features:
  • Sequence features
    • Nucleotide frequency counts
    • Total G/C content
  • Folding features
    • Pairing propensity
    • Minimum free energy (MFE)
  • Topological features
    • Packing ratio
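Some of the listed features are simple enough to compute directly. The sketch below derives the sequence features from a candidate hairpin sequence and a crude pairing feature from a dot-bracket structure string. The function names and the definition of pairing propensity as the fraction of paired bases are assumptions for illustration; the MFE itself would come from a folding program and is not computed here.

```python
from collections import Counter


def sequence_features(seq):
    """Nucleotide frequencies and G/C content of an RNA (or DNA) sequence."""
    seq = seq.upper().replace("T", "U")
    counts = Counter(seq)
    n = len(seq)
    feats = {f"freq_{b}": (counts[b] / n if n else 0.0) for b in "ACGU"}
    feats["gc_content"] = feats["freq_G"] + feats["freq_C"]
    return feats


def folding_features(dot_bracket):
    """Fraction of paired positions in a dot-bracket structure string.

    Assumption: this fraction stands in for 'pairing propensity'.
    """
    n = len(dot_bracket)
    paired = dot_bracket.count("(") + dot_bracket.count(")")
    return {"paired_fraction": paired / n if n else 0.0}
```

Concatenating such dictionaries into a fixed-order numeric vector gives one row of the feature matrix that a support vector machine would be trained on.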

MicroRNA target prediction
- Experimental / bioinformatics approach
  - BLAST can identify thousands of potential targets – how to pin down the real ones?

MicroRNA target prediction
- Computational / bioinformatics approach
  - Mutually exclusive transcription pattern between a miRNA and its targets
  - Microarray screening
  - Existence of complementary sequence
  - Context score – features
  - Machine learning approaches (e.g., SVM, regression, etc.)

David P. Bartel. MicroRNAs: Target Recognition and Regulatory Functions. Cell, Volume 136, Issue 2, 215-233, 23 January 2009.
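The "complementary sequence" criterion is often applied to the miRNA seed region. The following sketch scans a 3' UTR for exact matches to the reverse complement of miRNA nucleotides 2-8. The choice of that 7-nt window, the function names, and the exact-match rule are assumptions for illustration; the tools listed in this deck add conservation, context scores, and other features on top of such matches.

```python
def reverse_complement_rna(seq):
    """Reverse complement of an RNA string over the alphabet A, C, G, U."""
    comp = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(comp[b] for b in reversed(seq))


def seed_sites(mirna, utr, seed_start=1, seed_end=8):
    """Find exact seed-match sites in a UTR.

    The slice mirna[seed_start:seed_end] uses 0-based indexing, so the
    defaults select miRNA nucleotides 2-8 (an assumed seed definition).
    Returns (site_sequence, list_of_0_based_positions_in_utr).
    """
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    site = reverse_complement_rna(mirna[seed_start:seed_end])
    hits = []
    pos = utr.find(site)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(site, pos + 1)
    return site, hits
```

A 7-nt exact match occurs by chance roughly once every 16 kb of sequence, which is one reason sequence complementarity alone yields many false positives and has to be combined with the other evidence listed above.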


Databases
• MicroRNA.org: http://www.microrna.org/microrna/getMirnaForm.do
• miRBase: http://microrna.sanger.ac.uk
• …

Target prediction
• miRDB
• TargetScan (http://targetscan.org)
• PicTar (http://pictar.bio.nyu.edu)
• miRanda (part of Sanger database)
• MirTarget
• …

Software
• List at http://en.wikipedia.org/wiki/List_of_RNA_structure_prediction_software


Metagenomics
• Study of genetic material recovered directly from environmental samples
• A community of species – e.g., microbes from the stomach of a cow
• Challenges:
  • Who are there? How many?
  • 16S rRNA – universal primers, highly conserved, used for profiling
    forward: AGA GTT TGA TCC TGG CTC AG
    reverse: ACG GCT ACC TTG TTA CGA CTT
  • Next generation sequencing – more genes (chicken-and-egg)
  • Community metabolism – identify metabolic pathways within the community
  • New challenges: comparative study
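To illustrate how the 16S primer pair delimits the region used for profiling, here is a small in-silico sketch that locates both primers in a template and returns the sequence between them. It assumes exact matching on the forward strand of the template; the function names are hypothetical, and real primer binding tolerates mismatches that this sketch ignores.

```python
# Primer sequences as given on the slide, with spaces removed.
FORWARD = "AGAGTTTGATCCTGGCTCAG"
REVERSE = "ACGGCTACCTTGTTACGACTT"


def reverse_complement(seq):
    """Reverse complement of a DNA string over the alphabet A, C, G, T."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[b] for b in reversed(seq.upper()))


def find_amplicon(template, forward=FORWARD, reverse=REVERSE):
    """Return the region spanned by the primers, or None if either is absent.

    The reverse primer anneals to the opposite strand, so on the template
    strand we search for its reverse complement downstream of the forward
    primer.
    """
    template = template.upper()
    start = template.find(forward)
    if start == -1:
        return None
    rc = reverse_complement(reverse)
    end = template.find(rc, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(rc)]
```

Applied to assembled contigs or long reads, the extracted regions could then be compared against reference 16S sequences to address the "who are there" question.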
