Comparative Sequence Analysis
BioQUEST Workshop, Beloit, June 2004
Ivan Ovcharenko, Lawrence Livermore National Laboratory


Comparative genomics
Evolution of noncoding elements
Aligning vertebrate genomes
Function of the human gene deserts
Redefining comparative sequence analysis
Phylogenetic shadowing
Transcriptional gene regulation

The Genome Sequence: The Ultimate Code of Life

>hg16_dna range=chr11:
TCAGGAACTTTGAAATGTTTTAAAACCCCAACTTTCTCCCCCATTTAAAC
AGGCGGATTCATCGGCACTGGCCACCATATGGGCCCTTGGAGATCTATTG
AGATGACCACCAACACTTGAATAGCGAGGGGCTGCTTTTCAGCGCTGCAC
AATGCCCCGCGAGTAAGGGAAACTATTAAACTCCTGGGGCAGGAGCGTTG
GCAAACTTTCGTGGGCAGAATTTTGAGGCTACAATGAGCGCGGACAACAA
AAGGATTCTCTTGAGGCGTGCAGCGGGCCACATTGTGTTACAAGAAGCCC
AGTCAACAGACTTTTCAGTGAAGTGTGTTAACCCCTCTGCTCTGCTATCA
TTAATCACTGTCCGAAGAGCGGGCGCCTCCGTGCTATTTAGGGCGCTTGG
CTGGGGGGATGGAGGGTGGATGGGGGGGCCAGGGCCCAGCATGGGGGGAG
GCAGGGAGAGTGGACGGGGACCAGGGCTGGGTTCCTACATAGAGGAGATG
GAGGGGAGGCAGGATGGAAACCAGCGGTGGGGGTGGAAGCAAGGGGGAAG
GATTGGGGGGCCTGGGTTAGGGGAAAGACAGAGGGCGATGGAGGGAAAAA
GAGGGCGATGGAGGGGAAAAGAAGGCTCAAAAAACATAGAGGCTAGAAAG
GTATTTTTAAAAAAGGACAGAAAAGAATGCTGAGAGGAAAAAGAGACACG
AGGGCCGAACAAGAGTGGGAGAGAGAGGAAAAGGAGGATGAGGGCCAGAG
AATATTAGTAACTGAGCCCCATCTGGACTCTGGGTCTTTGCACTCCATCA
GAAAGGTGGGGGTCGAGGAGGGCTACTTAGCTGAGGGAGACGCGCTCCGC
TCACGTGTGCGGGCACAAGCGTCTGTGCTAATTTACTGCCCCAAGTTTCC
GGGGACTTTTCAAAGCGTTTTTCAAGGGAAGAAATGAAGCGACCACCCCC
ACCCCTCGCTTTATTTTCGGGTTTGGTGAAGAAGGAAGACTGGAAATAGC
TCCTTTTGGCCAACTAGAAAGGCCGGAGGGTTATTGCTTTTGGAAAACAG
ACAAAAATCTGTGCACATCTGGTATGGGGTGGGGGACACTGAGGAGAACA
CAATGCCCATCTCCCCATGGCCACTCATGCCCATGCCTTCCTAGGGGCCC
CATCTCGGTCCCTTTTCTGGCACATTCGATCTCGCCAATTAAACAAAGTT
GCCCGAATCTGCCTCCGAAGAACCCCGCCGATAGCATGCTCTGCTCTCAT
TTGCCTCTTTGACATTTTCTTAATTTTAAAACATGGAGATTCACATTCTT
ATCCATGTTCTGTCTCACACAAACATACACACGGGTTTACACAGGCAGCA
CGCGATCGCCGCCAGGCCCTGTGCTGCCTCCAGAACTGACACTTAAGAGA
GAAAAGTCAGCAGGGACAGTAGAGCTCAATTTTAAATCTGGAAAAAAAAA
AAAAAAAAAAAAAGATGGGAAGCGGGGATTGGAATTCCACAGCAAAAAGA
AACCTGTCGCTGCAGGATCCCTTCTCTACCCCGCGGGGAGAGCGGCACGG
AGACAGTTCATTACTTTAGAAGTGGCAACTGTTTGCAGCCAGGCGGTGAC
CTAGCGGCTGCTCTTACATAAAATGGGTACATTTCCCCCCACTTTAGTGG
ATTTGCCTTCCACTCTTAAAGCTTTTAACAAAATAAAACTAGAAGTTGGA
TCTCGACTCCCCCACCCCCACGATAAACCTAAGTGGTGGACAATTAAGAT
ATCTTCTTCAAAAGGCGCCCCCTCGGAGCCGCGCAAAGCAGGGGCCTTCA
GTGGGTGCCGTTCACCTTCCAGCCTAATCCGTGAGAAAGCGAGTGAAAGC
GCCTCCCATTATCCCAGCCCCAGGACCATCTGACGATGGGAATAGGATTT
GTTTCCTGGAAGGAGGTGAGAGAGAGAGAGAGAGAGAGAGACAGAGAGAG

Only 3% codes for proteins.
The function of the remaining ~47% (noncoding, nonrepetitive DNA) is unknown.
~50% is junk (repetitive elements).

Comparative Sequence Analysis

[Figure: panels dated 1880, 1920, 1950, and 2000.]

Biologically functional regions of the genome tend to stay conserved through evolution. Therefore, by aligning homologous sequences from different but related species, we can identify Evolutionary Conserved Regions (ECRs) of putative functional importance.

Evolution of the genomic code

Genomic modifications drove evolution:
mutations
insertions / deletions
duplications
rearrangements

Functional regions of the genome accumulated fewer mutations: natural selection eliminated species with mutations that altered the critical function of important elements.

Functionally important elements in the DNA stayed conserved through evolution.

How to find evolutionarily conserved elements?
actgactgactgATATTGACAgtttgttgttgttaa
agggacaaactgATATTGACAgt---ttgttgttaa
aggg--aaactgATATTGACAgt---ttgaaattaa
tggg--aaaccaATATTGACAgt-actcgaaattaa
tggg--aaaccaATATTGACAgt-actcgaaatgta

Top to bottom: millions of years of evolution. The uppercase block is a functional element.

Sequence Alignment

Human ACTTTACGGGATCTATCTATACCGGTAACGTAATCCGATACCAGT
Mouse ACTTTACGGGATCTCTCTATACCGGTAAAAAAAATTTAGT

Step 1 - find matches

Human ACTTTACGGGATCTATCTATACCGGTAACGTAATCCGATACCAGT
      |||||||||||||| ||||||||||||
Mouse ACTTTACGGGATCTCTCTATACCGGTAAAAAAAATTTAGT

Step 2 - find mismatches

Human ACTTTACGGGATCTATCTATACCGGTAACGTAATCCGATACCAGT
      ||||||||||||||:||||||||||||
Mouse ACTTTACGGGATCTCTCTATACCGGTAAAAAAAATTTAGT

Step 3 - insert gaps to linearize the alignment

Human ACTTTACGGGATCTATCTATACCGGTA----ACGT-AATCCGATACCAGT
      ||||||||||||||:||||||||||||    |::|            |||
Mouse ACTTTACGGGATCTCTCTATACCGGTAAAAAAAATTT          AGT

In the finished alignment, the first 27 columns are CONSERVED and the remainder is DIVERGED.

Conserved Elements

Numeric criteria of conservation: a minimal percent identity over a minimal length.
Current case: 95% / 30 bp
Common criteria: 70% / 100 bp
General: ????

Huge alignments

How to use them efficiently?
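The numeric criterion above (a minimal percent identity over a minimal length) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by any of the tools named in this workshop: the function names `column_matches` and `find_conserved_regions` are hypothetical, the sliding-window-and-merge strategy is one simple way to apply the criterion, gap columns are counted as mismatches, and the gap in the mouse row of the slide's alignment is written here with `-` characters. The thresholds in the example call (95% over 25 columns) are chosen for illustration only.

```python
def column_matches(seq_a, seq_b):
    """Return 1 for each alignment column where both rows hold the same base.

    Gap columns ('-') never count as matches (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return [int(a == b and a != "-")
            for a, b in zip(seq_a.upper(), seq_b.upper())]


def find_conserved_regions(seq_a, seq_b, min_identity, min_length):
    """Find regions covered by windows that meet the conservation criterion.

    A window of `min_length` columns qualifies when its fraction of matching
    columns is at least `min_identity`. Overlapping or touching qualifying
    windows are merged. Regions are (start, end) column indices, 0-based,
    end-exclusive.
    """
    matches = column_matches(seq_a, seq_b)
    n = len(matches)
    if n < min_length:
        return []
    regions = []
    window_sum = sum(matches[:min_length])
    for start in range(n - min_length + 1):
        if start > 0:
            # Slide the window by one column: add the new column, drop the old.
            window_sum += matches[start + min_length - 1] - matches[start - 1]
        if window_sum >= min_identity * min_length:
            end = start + min_length
            if regions and start <= regions[-1][1]:
                regions[-1] = (regions[-1][0], end)  # extend previous region
            else:
                regions.append((start, end))
    return regions


# Gapped human/mouse alignment from the slide (mouse gap written as '-').
human = "ACTTTACGGGATCTATCTATACCGGTA----ACGT-AATCCGATACCAGT"
mouse = "ACTTTACGGGATCTCTCTATACCGGTAAAAAAAATTT----------AGT"

print(find_conserved_regions(human, mouse, min_identity=0.95, min_length=25))
```

With these inputs the call reports a single region, `[(0, 27)]`, which corresponds to the block of 27 columns with one mismatch at the start of the alignment. With a 30-column window the same call returns an empty list, because that block is only 27 columns long and gap columns count as mismatches in this sketch.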
Human aaTtAAGGgTAAgTTTAcAtTGtttggAGCAAagGAaTAgcgATGcTCtCTTTGAATGAC
        | |||| ||| |||| | ||     |||||  || ||   ||| || |||||||||||
Mouse --TcAAGGcTAAaTTTAtAcTG----aAGCAActGAcTActaATGtTCcCTTTGAATGAC

Human cAAGAgATTA---TTTTtAAATAAGcacCAAaTAcAAatAAAATgCtAtTgGCTAAAGTT
       |||| ||||   |||| |||||||   ||| || ||  ||||| | | | |||||||||
Mouse tAAGAtATTActaTTTTgAAATAAGtgtCAAgTAgAAgcAAAATaCcAaTtGCTAAAGTT

Human CAaTTtgTTTTgCATAcTTGTTTCTAATAAGgACAtAtGAgcCacAAAATaGCCAAAGGG
      || ||  |||| |||| |||||||||||||| ||| | ||  |  ||||| |||||||||
Mouse CAgTTcaTTTTcCATAtTTGTTTCTAATAAGtACAcAcGActCttAAAATcGCCAAAGGG

Human AGgGAAAAaaCCCTcAACtgCTAACAGCACATTAACAAAGTATAGAAAcGAAAGACACTT
      || |||||  |||| |||  |||||||||||||||||||||||||||| |||||||||||
Mouse AGaGAAAAg-CCCTgAACgtCTAACAGCACATTAACAAAGTATAGAAAgGAAAGACACTT

Human TTCTTTGGATTTCAGCCTTGTCATTTCCAATTTTCTGCTCCTTGGACATGCTTGTATTCA
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Mouse TTCTTTGGATTTCAGCCTTGTCATTTCCAATTTTCTGCTCCTTGGACATGCTTGTATTCA

Human AATTCTGGAACATCTATTCAGCATATCAATCCTAATTAGACAATCTGGGTCTGGAAAGGA
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Mouse AATTCTGGAACATCTATTCAGCATATCAATCCTAATTAGACAATCTGGGTCTGGAAAGGA

Human TGaGAGCTGGGTCATTTGCATAATTTAATCATAAATACTCAGTGATACATATTTCCAAAT
      || |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Mouse TGgGAGCTGGGTCATTTGCATAATTTAATCATAAATACTCAGTGATACATATTTCCAAAT

Human GCATTTGTACAATTATCTTTTCATCCTTGGGGCAATGGTATTAATATGATTAGGCAATAT
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Mouse GCATTTGTACAATTATCTTTTCATCCTTGGGGCAATGGTATTAATATGATTAGGCAATAT

Human TTCTGGAAAAAACAGACAAGTATGCACTCTTTTTAACTGCAGCTTAgGGCGATATGAAAA
      |||||||||||||||||||||||||||||||||||||||||||||| |||||||||||||
Mouse TTCTGGAAAAAACAGACAAGTATGCACTCTTTTTAACTGCAGCTTAaGGCGATATGAAAA

Human ATTAATTAATTTCTGAAGAAAATCAATTTCTCTACGTGACCACATTAGACATtgCTAAAC
      ||||||||||||||||||||||||||||||||||||||||||||||||||||  ||||||
Mouse ATTAATTAATTTCTGAAGAAAATCAATTTCTCTACGTGACCACATTAGACATcaCTAAAC

Human GTATtTGAACAGtTCAATAGAAAAaCTgGTAATGTATCAAAGAGCATCTTAAATTtTGAA
      |||| ||||||| ||||||||||| || ||||||||||||||||||||||||||| ||||
Mouse GTATgTGAACAGcTCAATAGAAAAtCT-GTAATGTATCAAAGAGCATCTTAAATTgTGAA

Human GAGATCtTtCTGCctACTTTCtTtTaggGCAcaCCaCTcTgCTTTACTTtaAtGcATTGT
      |||||| | ||||  |||||| | |   |||  || || | ||||||||  | | |||||
Mouse GAGATC-TcCTGCtcACTTTCcTgTccaGCAttCCtCTtTcCTTTACTTagAgGaATTGT

Human TATTTAACCAGTCAATGAGAAGtCTGtGCTTTtGGTGTGAACTCATCTtGAGTGATCTTT
      |||||||||||||||||||||| ||| ||||| ||||||||||||||| |||||||||||
Mouse TATTTAACCAGTCAATGAGAAGcCTGgGCTTTcGGTGTGAACTCATCTcGAGTGATCTTT

Human TATTAATGTACATTAAcCAATTTCAAGGACAACAGGATAAGGTTACTTtTGAAagGCTTT
      |||||||||||||||| ||||||||||||||||||||||||||||||| ||||  |||||
Mouse TATTAATGTACATTAAgCAATTTCAAGGACAACAGGATAAGGTTACTTcTGAAttGCTTT

Human CTCAAGAAAtGGATTTATATTCaTCtAAAATAATCtTAAtTCACATGAcACTGTTTATtA
      ||||||||| |||||||||||| || ||||||||| ||| |||||||| ||||||||| |
Mouse CTCAAGAAAcGGATTTATATTCtTCcAAAATAATCgTAAcTCACATGAgACTGTTTATcA

Human t---tAAAAAAtTAGATAAaCcAAGTCcTCTTaAAAtGTAcCAtTtTCATAAGaAaAACa
           |||||| ||||||| | ||||| |||| ||| ||| || | ||||||| | |||
Mouse ggaagAAAAAAaTAGATAAgCtAAGTCaTCTTgAAA-GTAtCAcTgTCATAAGgAgAACg

Human TTaTaAtATaCTtaGTgGAGctctAAGAACCCAGGTGGCTAATCTGA-TTTTTaAAAAAG
      || | | || ||  || |||    ||||||||||||||||||||||| ||||| ||||||
Mouse TTgTcAcATtCTctGTaGAGacagAAGAACCCAGGTGGCTAATCTGAtTTTTTtAAAAAG

Human AGATTCTGCTTTGTATGTTAATTAGTacaAAAGAAAGAAGTcaCATTTGTGAGTTTAAAT
      ||||||||||||||||||||||||||   ||||||||||||  |||||||||||||||||
Mouse AGATTCTGCTTTGTATGTTAATTAGTgacAAAGAAAGAAGTggCATTTGTGAGTTTAAAT

Human gCACTATTCTTTtCcTTtCAATCaAatgAAAAAGTAGAAATTACTGCATGCAAATATTCA
       ||||||||||| | || ||||| |   ||||||||||||||||||||||||||||||||
Mouse aCACTATTCTTTcCtTTaCAATCgAgcaAAAAAGTAGAAATTACTGCATGCAAATATTCA

Graphical conservation profiles

Two different ways to describe large genomic alignments. Colored regions correspond to areas of evolutionary conservation.

1. Percent identity plots (PipMaker). Figure label: an 80%, 100 bp alignment block.
   Schwartz S, Zhang Z, Frazer KA, Smit A, Riemer C, Bouck J, Gibbs R, Hardison R, Miller W. Genome Research, 2000.
2. Smooth graphs (Vista). The vertical coordinate gives the average percent identity in a 100 bp window centered at a given nucleotide.
   Mayor C, Brudno M, Schwartz JR, Poliakov A, Rubin EM, Frazer KA, Pachter LS, Dubchak I. Bioinformatics, 2000.

From Comparative Genomics to Genome Biology

[Figure: conservation profile of the 5q31 region on a 50% to 100% identity scale, with RAD50, IL-13, KIF3, IL-4, and a Cyclin I homolog; exons and conserved non-transcribed sequences are marked, and one ECR is labeled 84%.]

5q31 region: 245 conserved elements (>70%, >100 bp), of which 155 are exons and 90 are noncoding.

Experimental assessment of the biological function of evolutionarily conserved regions

Removal of ECR-1 from the mouse genome.

[Figure: ECR-1 flanked by LoxP sites, shown with IL4 and IL13; maps of the wild-type and ECR-1 knockout locus with IL4, IL13, Rad50, IL5, and ECR-1; scale labels 10 kb, 6 kb, 120 kb.]

Loots et al., Science (2000)

Expression of 3 cytokines reduced in ECR-1 knockout

[Figure: IL4, IL13, and IL5 levels (pg/ml) in -ECR1/-ECR1 mice compared with WT.]

Aligning Vertebrate Genomes

Human genome

GENOME is a huge book of life: a sequence of 3 billion letters from a short, 4-letter alphabet: A, C, T, and G.

24 CHROMOSOMES form chapters: the average chromosome is ~100 million letters long.

TTATCTTTCAAGATTTTAAAGGTGTTCCTAATATTTTACACAAAAGCATG
AGCACTAGATATGGTTGCAAAATACTGGGTGATGAGTTATACTGCCATTC
TCTGCTTTCCTGTGAACTCCTTTATTTGTATAGTAGCTATATGCTCAGAC
GTTGAAAATATAAGAAGTGAAGTACCCTGAAAAGTATCACATGATGGCAC
TGTTTCCATTTCCACATCCAATATTATGAAATAAAGCTATAATAAACTGG
TATTAAGAATGGGGTATAATGCCAGTGTATTTTGTATAATTTATGTAAAA
TAAAAATCTAACCACTATGGTTATTAATATGGGTACTAAAGTGAATTCAT
AGATTTTTCACAAAATGTTTTGTAAAAGCTTGCATTTCTATAATGTCTAT
AATTTAGATCACAAAGAAACAATTTATCTAGATATTAACAATTTTAGTAA
CACGGAAAACAGCTTCATTAATTACTTGAGTTGCTTTACAAACTATTTTT
TAAAATAGTATATTTTATGTTATATTTCAGTTTTAATTGGGAAGAAATAA
CGCTGTATCATACATGAGATTTATCTGTGGCAAATATGACCATTTGCATG
GAATTATTTCCGAAGAATGCAAAGAAAGTGTATAAATAATATTGAAAAGT
ACATGGATCAGTGGTTGAAGGGATCAAGCACAATTTTAAAGTGAACAAAA
TTTAAATGTGGCCAACCTGAATATTTAAAGGGTTCATTAATCTGAGAAAT
GTAAATGTTAAATGGTGTGTGATTTCAACTACCATTATTTATTATGGTAA
ACAGTCTTTCCTATATAATAGGCATGAAAAAATGGTGTGGAGTGATTATC
ATCTCAGGAATGAGAGTACAATAATTTTCTATTCCTAACAAAAAAGAAAA
AAAAATGATCAAAATGTGATGTGATATATAGTGAAGTACTATGTAGATGT
GGATGTTTAAAGATGAACCAAGCATCAGGATTTCACCAAATTTTATCTAT
AATAATGAATTAATAATAGTGGATATAGATACATCTTCCCAGTGGCATGA
GTGTGGTAAAAAAGATACAAAGCTCTATGGACTTGAAATGATGCCCCTCT
AGTGATGTTAAAGAACCTAATGGCCAGAATTTGGAAGTGCAGCAAGTGAG
TGCTGTAAGAATATTTTTAAATGTGATCAGTTTATATTTGTTTTAATATG
ACAGAAAAAATACTTTGCACAATTTTCCTTTTAATTCATCTGTGAACTTG
TCTCGGGGGGAAAACATACATGTGAAGTGTTCTTACTGTATTCTTTTAAA
AATAAATATGAAAAATAATCATGCAGGTAAACCAATTCCAAATATTTATC
TTAACGACATCCCCAAAATCTTAAAGGTATATACTAGGCATAAACCTTAA
ACCTTTAATCACAGTGGAGATAAATTCCTCCTACAAAAAGAAATGTGTAA
AGTAGAACTAACTATTCTGATATATTATTCTATGTAATCATTTCTCAAGT
CTGTCTTTAAACAAATAGTTACATCTTATTATAAAGACAATAAATAAATA
CATTTTCCTAGAAATCCATCTTGAAATAAGGATTTCTTGCACCCTAGTTT
CAAGAATACACTGGTGTCCTATCACCTCCTTTGGGAAAGTGACAGTTTGC
ATAATACTTTTCACATAAGAGAAAAATTTAAATAATGATATTGAGGAAAT
TGTTGAAACATTGCCTAATGGTATAGTAACAAAAAGTATTCATAAATCTG
TACTGTAGAAGAGAAAATATACACTACAATAATCTGTTCATTTGTCTTAG
AAGAGGGGAGAAAAAAACCCAGAATACTGAAATAGGAAATTTCCATGTTC
ACTGTATTTCACCATGCAAATCACTTGCAATTTCCAAATGCCAGTGTTAC
TTTTCAGGACAAATTTCACACAAAAGGAATTCAGTGATTATTCATCCAGT
TTAATAATTCAATTAAATAAGTCTGATGCTGTCAGGTGTTCTTTTAATAA

GENES are short stories: every chromosome has ~1,000 genes, and every gene has a function in the human body.

Sequenced vertebrate genomes

human, mouse, rat, fugu, zebrafish, tetraodon

[Figure: tree of the sequenced vertebrates with divergence times of ~80 MY and ~400 MY; branch labels x4, x0.5, x2, x0.5.]

Comparing Genomes

1. Mask out repetitive elements (RepeatMasker)
2. Map syntenic regions in two genomes (BLAT)
3. Align syntenic regions (BLASTZ)
4. Visualize alignments (ECR Browser)

Times to align human and mouse genomes (3 Gb vs 3 Gb)

Mapping/Aligning   Location               Time
Blastz/Blastz      UCSC Genome Browser    1000 days (3 years)
Blat/Avid          Vista Genome Browser   1 month
Blat/Blastz        ECR Browser