Y.B. Lebedev et al. Gene 247 (2000) 265–277

    Gene 247 (2000) 265–277

    www.elsevier.com/locate/gene

    Differences in HERV-K LTR insertions in orthologous loci of humans and great apes

    Yuri B. Lebedev a,*, Oksana S. Belonovitch a, Natalia V. Zybrova a, Paul P. Khil a, Sergey G. Kurdyukov a, Tatyana V. Vinogradova a, Gerhard Hunsmann b, Eugene D. Sverdlov a

    a Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 16/10 Miklukho-Maklaya St., Moscow 117871, Russia

    b German Primate Centre, Department for Virology and Immunology, Kellnerweg 4, D-37077 Goettingen, Germany

    Received 23 August 1999; received in revised form 2 December 1999; accepted 25 January 2000

    Abstract

    The classification of the long terminal repeats (LTRs) of the human endogenous retrovirus HERV-K (HML-2) family was refined according to diagnostic differences between the LTR sequences. The mutation rate was estimated to be approximately equal for LTRs belonging to different families and branches of human endogenous retroviruses (HERVs). An average mutation rate value was calculated based on differences between LTRs of the same HERV and was found to be 0.13% per million years (Myr). Using this value, the ages of different LTR groups belonging to the LTR HML-2 subfamily were found to vary from 3 to 50 Myr. Orthologous potential LTR-containing loci from different primate species were PCR amplified using primers corresponding to the genomic sequences flanking LTR integration sites. This allowed us to calculate the phylogenetic times of LTR integrations in primate lineages in the course of the evolution and to demonstrate that they are in good agreement with the LTR ages calculated from the mutation rates. Human-specific integrations for some very young LTRs were demonstrated. The possibility of LTRs and HERVs involvement in the evolution of primates is discussed. © 2000 Elsevier Science B.V. All rights reserved.

    Keywords: HERV-K; HML-2; Human endogenous retroviruses; Human genome; LTR; Primate evolution

    Abbreviations: HERV(s), human endogenous retrovirus(es); LTR(s), long terminal repeat(s); Myr, millions years (ago); PCR, polymerase chain reaction; TEs, transposable elements.

    * Corresponding author. Tel.: +7-095-330-6992; fax: +7-095-330-6538. E-mail address: [email protected] (Y.B. Lebedev)

    0378-1119/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved.

    PII: S0378-1119(00)00062-7

    1. Introduction

    Rapid progress of the Human Genome Project and related achievements in the development of technologies for gene identification, mapping and sequencing have opened new horizons for revealing the molecular events that underlie the processes of speciation and, in particular, the genetic causes of great apes divergence in the evolution. One of the most exciting questions in this field is what differences between human and ape genomes make these related species so phenotypically different. To answer this question one has to compare the human genome with the genomes of ape species such as orang-utan, gorilla and, of course, the closest human relative, chimpanzee. The next step would be to associate the genomic variations with interspecies differences at the level of expressed proteins/enzymes including tissue specificity and/or inducibility of various genes. Chimpanzee (Pan troglodytes) and man (Homo sapiens) have been thoroughly compared, both as the organisms and using available biochemical and genetic data, and since the classical work of King and Wilson (1975) it is widely accepted that human proteins and genes are basically 99% identical to their chimpanzee counterparts. This remarkably low level of difference allowed King and Wilson (1975) to formulate a concept of regulatory evolution suggesting that a relatively small number of genetic changes in systems controlling the expression of genes may account for major organisational differences between human and chimpanzees. Since that time, new highly efficient techniques of structural analysis of proteins and nucleic acids have been developed, and a large number of new structures were compared having confirmed that homologous, orthologous sequences of human and chimpanzee are indeed at least 98.5% identical [see comment in Gibbons (1998)]. It has been


    found, however, that there are quite a number of qualitative differences between the two genomes, including the absence of some chimpanzee DNA stretches from the human genome, and vice versa. The differences include: (i) differences in chromosome organisation

    1995; Leib-Mosch and Seifarth, 1996; Lower et al., 1996; Patience et al., 1997) and now occupy up to 1% of the human genome. HERVs, being various in primary structures and abundance, are thought to have been inserted into the germ-line at different times between


    Table 1

    Oligonucleotide primers used to amplify HERV sequences

    No. Sequence (5′→3′) Designation

    Suppression set

    1 GTAATACGACTCACTATAGGGCAGCGTGGTCGCGGCCGAGGT T7Not1 suppression adapter

    2 ACCTCGGC Not2 adapter

    3 GTAATACGACTCACTATAGGGC T7 A1-primer

    4 AGCGTGGTCGCGGCCGAGGT Not1 A2-primer

    Primers specific for U3 termini of the LTRs

    5 TGTTTTTGTGAGCTCAAGGTTGGG 192oR T2-primer

    6 TGTTTCAGAGAGCACGGGGTTGGG 192yR T2-primer

    7 AACCTTGATTCAATACAACACATG 214oR T1-primer

    8 AACCCTGAGTTGACACAGCACATG 214yR T1-primer

    Primers specific for U5 termini of the LTRs

    9 TCCTCCRTATGCTGAACGCTGGTTCC 915yF T1-primer

    10 TATGCTGAGCGCCGGTCCC 922oF T1-primer

    11 TGAGCGCCGGTCCCCTGGGCC 927oF T2-primer

    12 TGAACGCTGGTTCCCTGGGCC 927yF T2-primer

    Loci-specific primers

    AGTCTGACAGGAATGGAACTGC ltr12-F

    CACCACTGCCAGCTCAATC ltr12-R

    CTCAATCCATTGCACACTGC ltr18-F

    GGTGGAAATTGTGGCCTG ltr18-R

    ATGCTCGAAACTACCTGCACTT ltr30-F

    ATTATGCAACCTGGGTCTGTCA ltr30-R

    CGTGCTAAGAGTTATCCACACC ltr31-F

    TGTGTATTTGCTCACTCGCTG ltr31-R

    GCTGGAATGGAGGTATTATTGT ltr32-F

    AAAGTAACTGCCACTTGTGAAAC ltr-32-R

    GGCTGGCTTTTCAGGTCG ltr41-F

    GTCAGTGGCTGCCTGCTGATTTG ltr41-R

    GTGTTTGAGAAGCTCCTGCC ltr47-F

    AATCGAGGAACCGGAAGTG ltr47-R

    TTCAAGCAGGAAGTCACC ltr50b-F

    ACACATGGCGTGTAAAGTC ltr50b-R

    CATGGGGAGACAAGCCATC ltr69-F

    TGTTGGCCTCAGCGTACC ltr69-R

    AAATGACTGATACTAATCCAACCAC ltr70-F

    TGGCAGGGACACAGTGAGG ltr70-R

    CTCCCATTTTAATTTAGCACCG 2508-F

    CCTTTGACCTGTTGAAGTGATG 2508-R

    CCTGGCATACAACACTTAACGT 0041-F

    CAGGGCCAGGATTTGAAC 0041-R

    CCAGTGCCACAAGGTCAG 5612-F1

    CCGATTCCCCATTCATTCCAG 5612-F2

    AAGAATGGCAGCGTTGATG 5612-R1

    GTTGATGCCTGTCCCTCTGCC 5612-R2

    TTGGGATGACCAGTAACCG 6684-F1

    AGGGAACCAGCGCACACAGC 6684-F2

    CATCTCTGGGCTAAGGCATC 6684-R1

    TCAGTCCCACAAAGGCATCAGT 6684-R2
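
The suppression-set oligonucleotides of Table 1 are related to one another by sequence, and these relationships can be checked directly. The following is a minimal sketch: the four sequences are copied from Table 1, whereas the helper `reverse_complement` and all variable names are assumptions introduced here for illustration.

```python
# Sequences copied from the "Suppression set" rows of Table 1.
T7NOT1_ADAPTER = "GTAATACGACTCACTATAGGGCAGCGTGGTCGCGGCCGAGGT"  # oligonucleotide 1
NOT2_ADAPTER = "ACCTCGGC"                                      # oligonucleotide 2
A1_PRIMER = "GTAATACGACTCACTATAGGGC"                           # oligonucleotide 3
A2_PRIMER = "AGCGTGGTCGCGGCCGAGGT"                             # oligonucleotide 4

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# A1 matches the outer (5') part of the long adapter, A2 the inner (3') part.
a1_is_outer_part = T7NOT1_ADAPTER.startswith(A1_PRIMER)
a2_is_inner_part = T7NOT1_ADAPTER.endswith(A2_PRIMER)
a1_plus_a2_is_adapter = (A1_PRIMER + A2_PRIMER) == T7NOT1_ADAPTER

# The short oligonucleotide pairs with the 3' end of the long adapter.
adapter_3prime_end = T7NOT1_ADAPTER[-len(NOT2_ADAPTER):]
short_oligo_pairs_with_3prime_end = (
    reverse_complement(adapter_3prime_end) == NOT2_ADAPTER
)

print(a1_is_outer_part, a2_is_inner_part,
      a1_plus_a2_is_adapter, short_oligo_pairs_with_3prime_end)
```

The layout of an outer A1 site and an inner A2 site is what allows the second, nested PCR round to be primed inside the products of the first round.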

    2.3. Preparation of adapter-ligated DNA

    500 ng of cosmid DNA was digested in 50 µl of the restriction buffer containing 20 u EcoRI, PstI or AluI restriction enzymes at 37°C for 90 min, and further incubated for 90 min after addition of 10 u of fresh restriction enzyme. The termini of the fragments were filled in with the Klenow fragment of the DNA polymerase under standard conditions. Ligation to adapters was carried out in 30 µl of a buffer containing 50 mM Tris–HCl, pH 7.6, 10 mM MgCl2, 0.5 mM ATP, 10 mM dithiothreitol, 2 mM adapter (oligonucleotides 1 and 2, Table 1) and 5 u T4 DNA ligase (Life Technologies). Samples were incubated at 16°C overnight, and the reactions were terminated by heating the reaction mixtures at 75°C for 5 min. DNA was then separated from the primers with a QIAquick DNA Purification Kit (Qiagen, CA) and eluted with 50 µl of sterile water.


    2.4. LTR-flanking DNA amplification by PCR

    35 ng of adapter-ligated DNA was amplified using 5 pmol of each A1 and T1 primer and AmpliTaq-DNA Polymerase (Perkin–Elmer Cetus) in a standard PCR medium: 50 mM KCl, 10 mM Tris–HCl (pH 9.0), 2.5 mM MgCl2, 0.2 mM each of dNTPs in a final volume of 25 µl. Primers 7, 8, 9, or 10 (Table 1) were selected according to the priming direction and the LTR structure. Thermocycling conditions were 30 s denaturation at 94°C, 30 s annealing at 60°C, 40 s elongation at 72°C, 17 cycles (thermocycler OmniGene, Hybaid, UK). The PCR products obtained after the first PCR step were 1000-fold diluted and amplified in the second PCR round using 10 pmol of A2 and T2 primers (oligonucleotides 5, 6, 11, or 12 depending on the primers used for the first round of PCR). The conditions for PCR were the same as in the first round, except that the number of amplification cycles was 20–25. The resulting PCR products were analysed by electrophoresis in a 2% agarose gel.

    2.5. Sequencing

    Templates for sequencing were obtained at the second round of PCR amplification using A2 and T2 primers. PCR products were purified using a QIAquick-spin PCR Purification Kit (QIAGEN). PCR fragments were sequenced manually with fmol Sequencing System (Promega) using A2 and T2 primers labelled with [γ-32P]-ATP and polynucleotide kinase. Complementary strand sequences were aligned using the DNAsis and Gene Runner programs.

    2.6. Sequence analysis

    LTR flanking sequences were analysed by advanced BLAST and standard Repeat Masker programs. LTR searching and extraction, preparation of LTRs, alignment, and its refinement using Clustal, GDE, GeneDoc and Phylip programs were done as described previously (Lavrentieva et al., 1998).

    2.7. Genomic PCR

    10 ng of DNA purified from human or primate blood samples was PCR amplified (27–32 cycles) using 10 pmol each of specific primers and AmpliTaq-DNA Polymerase (Perkin–Elmer Cetus).

    2.8. Primate sequences analysed

    DNA templates purified from the following species were obtained through the Gene Bank of Primates.

    Species                                No. of samples
    Homo sapiens                           20
    Pongidea and Hylobatea
      Chimp (Pan troglodytes)              5
      Gorilla (Gorilla gorilla)            4
      Orang-utan (Pongo abelii)            3
      Gibbon (Hylobates syndactylus)       1
             (Hylobates lar)               2
    Old World monkey
      (Macaca arctoides)                   1
      (Macaca mulatta)                     2
      (Mandrillus sphinx)                  2
      (Papio hamadryas)                    3
      (Colobus quereza)                    1
    New World monkey
      (Callimico goeldii)                  2
      (Callithrix pigmae)                  2
      (Saimiri sciureus)                   2

    3. Results and discussion

    3.1. Average ages of different HERV-K (HML-2) LTR branches

    Phylogenetic analysis shows that most of HERVs entered the genome early in primate evolution: most of HERV families entered and/or were amplified in the germ line after the separation of Old and New World monkeys. Therefore, their age can be estimated as 30–50 Myr (Leib-Mosch and Seifarth, 1996; Lower et al., 1996). However, some HERV-related sequences were detected in New World monkeys, and they are older than 45 Myr (Simpson et al., 1996). There are data indicating that some of HERVs might have been integrated into the genome even earlier, more than 60 Myr ago, i.e. before the divergence of prosimians and New World monkeys (Anderssen et al., 1997). In our previous work we classified human HERV-K retroviral LTRs into groups according to their divergence (Lavrentieva et al., 1998). The accumulation of new data on the LTR sequences in databases allowed us to improve the previous classification and to identify additional branches (see Table 2). For example, a previously uniform K branch was subdivided into two closely related, though different branches K1 and K2 having common and distinct diagnostic substitutions. The improvement made the branches more homogeneous thus allowing us to deduce more reliable consensus sequences.

    As a result, more exact intragroup divergences were calculated. These divergences can be used to calculate the age of the branch ancestor (master or source) gene, the retropositions of which gave birth to the branch members. Such calculations are possible providing that the average rate of divergence is the same for different branches as observed for other retroelements. Another prerequisite is the knowledge of the LTRs mutation


    Table 2
    Calculated average divergences within HERV-K LTR subfamilies and branches, and estimated time of their ancestors insertion a

    Subfamily     Average internal   Branch        No. of LTRs   Average internal    Estimated time (T) of
    designation   divergence (%)     designation   in branch     divergence D (%)    master gene insertion (Myr)

    LTR I         21                 I-E           17 (11) b     8.7                 33.5
                                     I-D           3             6.5                 25.0
                                     I-S           8             10.4                40.0
                                     I-P           7             9.2                 35.4
                                     I-K1          7             11.2                43.1
                                     I-K2          10            13.8                53.1
                                     I-Y           12            10.7                41.2
                                     I-A           17            11.3                43.5
                                     I-I           5             6.1                 23.5
                                     I-X           5             7.7                 29.6
    LTR II c      5                  II-H          4             6.5                 25.0
                                     II-O          6             8.6                 33.1
                                     II-N          6             5.7                 21.9
                                     II-V          14            8.2                 31.5
                                     II-T          12            4.2                 16.2
                                     II-L1         5             2.7                 10.4
                                     II-L2         6             2.5                 9.6
                                     II-L3         10            1.6                 6.2
                                     II-L4         6 (4) b       0.9                 3.5

    a The mutation rate of 0.13%/Myr was used for the group age calculations by the formula T = D/(2 × 0.13), where D is the divergence value (%) and T is the time (Myr) passed since the integration event. The factor 2 is used because the average divergence values presented in Table 2 correspond to the average of differences between each two LTRs in a given group, which is expected to be two times higher than the average number of mutations accumulated by each of the LTRs during its evolution.

    b Two numbers are given in this cell: the total number of entries and the number of sequences except recombinant ones (in brackets).

    c An analysis revealed an additional group called II-B consisting of three LTRs with an internal divergence of 11.9%. However, based on diagnostic mutations, this group can be assigned to the subfamily II of LTRs having considerably lower intragroup divergence. The reason(s) for such a discrepancy remains unknown, and possibly the group will be further split into several subgroups as more LTR sequences become available. At this point we do not consider this group.
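
The age estimates in Table 2 follow from the formula in footnote a, T = D/(2 × 0.13). The sketch below recomputes them; the divergence and age values are copied from Table 2, while the function name `branch_age_myr` and the dictionary layout are assumptions made for this illustration.

```python
MUTATION_RATE = 0.13  # % per Myr per LTR, the average rate used in the paper


def branch_age_myr(divergence_pct: float, rate: float = MUTATION_RATE) -> float:
    """Age T (Myr) of a branch from its average pairwise divergence D (%).

    The factor 2 reflects that a pairwise divergence sums the mutations
    accumulated independently by two LTRs.
    """
    return divergence_pct / (2 * rate)


# branch: (average internal divergence D in %, age in Myr reported in Table 2)
TABLE2 = {
    "I-E": (8.7, 33.5), "I-D": (6.5, 25.0), "I-S": (10.4, 40.0),
    "I-P": (9.2, 35.4), "I-K1": (11.2, 43.1), "I-K2": (13.8, 53.1),
    "I-Y": (10.7, 41.2), "I-A": (11.3, 43.5), "I-I": (6.1, 23.5),
    "I-X": (7.7, 29.6), "II-H": (6.5, 25.0), "II-O": (8.6, 33.1),
    "II-N": (5.7, 21.9), "II-V": (8.2, 31.5), "II-T": (4.2, 16.2),
    "II-L1": (2.7, 10.4), "II-L2": (2.5, 9.6), "II-L3": (1.6, 6.2),
    "II-L4": (0.9, 3.5),
}

for branch, (divergence, reported_age) in TABLE2.items():
    computed = branch_age_myr(divergence)
    print(f"{branch:6s} D={divergence:5.1f}%  T={computed:5.1f} Myr  "
          f"(Table 2: {reported_age})")
```

Because T scales inversely with the assumed rate, using one of the other rates mentioned in the text (for example 0.26%/Myr) would halve every age in the table.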

    rate. Unfortunately, the mutation rates vary considerably for different genome constituents. For instance, the rate of 0.273%/Myr was determined for α-enolase pseudogene, whereas quite different rates were reported for other pseudogenes: 1.26% for γ-globin, 0.43% for lactate dehydrogenase, and 0.1% for α- and β-globin [reviewed in Minghetti and Dugaiczyk (1993)]. Finally, Britten (1994) used the mutation rate for Alu repeats of 0.13–0.16%/Myr. Authors working with HERVs often use different mutation rates for HERV age estimations. For instance, Mager and Freeman (1995), Anderssen et al. (1997), and we in our previous work used the rates 0.12%/Myr, 0.2%/Myr and 0.26%/Myr respectively. Here we estimated an average mutation rate for LTRs using well-documented divergences either among retroviral LTRs belonging to the same retrovirus when the time of integrations was known from phylogenetic analysis, or among orthologous LTRs in different species. In the first case the LTRs were probably identical at the time of the retroelement integration (Dangel et al., 1995), and then independently accumulated mutations, the number of which should increase with time passed since the insertion event. Therefore, the differences between them can be used for calculations of the mutation rates if the ERV insertion time is known or can be estimated.

    Table 3 demonstrates examples of divergences between 5′- and 3′-LTRs flanking an ERV sequence and corresponding mutation rates calculated for the HERV-K(C4) element located in the complement system C4 loci and detected in the human genome as well as in the same sites of the genomes of some other primates (Dangel et al., 1995). The divergences between 5′- and 3′-LTRs of the HERV-K(C4) were calculated for human (9.1%), orang-utan (10.1%), and Old World monkeys (10.5%) (Dangel et al., 1995). According to the authors, the virus integrations have occurred after the divergence of New World monkeys, i.e. around 45 Myr ago (see footnote c to Table 3). Therefore, the mutation rates can be calculated from these data by the formula D/2T, where D is the divergence value (%) and T the time (Myr) passed since the integration event, i.e. in this case 45 Myr. The factor 2 in the denominator is used because the differences between the LTRs are the sum of mutations in both LTRs.

    On the other hand, the divergences between orthologous LTRs of HERVs in different species (i.e. between 3′-LTRs or between 5′-LTRs integrated in the same


    Table 3
    Intra- and inter-species percent divergence of the HERV-K(C4) LTRs (Dangel et al., 1995) and calculated mutation rates (%/Myr) a

    Species b    Hsa(3′)       Ppy(5′)       Ppy(3′)       OWm(5′)       OWm(3′)
    Hsa(5′)      9.1 (0.10)    2.4 (0.09)                  7.2 (0.13)
    Hsa(3′)                                  5.5 (0.21)                  8.7 (0.16)
    Ppy(5′) c                                10.1 (0.11)   8.6 (0.15)
    Ppy(3′)                                                              8.7 (0.16)
    OWm(5′)                                                              10.5 (0.12)

    a The figures in brackets represent mutation rate values (% per Myr) calculated as D/2T (see explanation in Section 3.1).

    b Hsa, humans; Ppy, orang-utan; OWm, Old World monkey (Dangel et al., 1995).

    c The branching data for primate evolution were averaged from three estimates (Sibley and Ahlquist 1987; Britten, 1994; Takahata and Satta, 1997): New World monkeys, 45 Myr; Old World monkeys, 28 Myr; gibbon, 18 Myr; orang-utan, 13 Myr; gorilla, 8 Myr; chimpanzee, 5.6 Myr.
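
The bracketed rates in Table 3 can be reproduced with the formula D/2T. In the sketch below the divergences and reported rates are copied from Table 3 and the branching times from its footnote c. The choice of T for each comparison is partly an inference made here: 45 Myr for 5′ versus 3′ LTRs of one provirus and 28 Myr for comparisons with Old World monkeys are stated in the text, whereas 13 Myr for human versus orang-utan is assumed from footnote c. The names `mutation_rate` and `COMPARISONS` are illustrative only.

```python
T_INTEGRATION = 45.0  # Myr, integration after the New World monkey split
T_OWM_SPLIT = 28.0    # Myr, Old World monkey branching (footnote c)
T_ORANG_SPLIT = 13.0  # Myr, orang-utan branching (footnote c); assumed here


def mutation_rate(divergence_pct: float, time_myr: float) -> float:
    """Mutation rate (% per Myr) from divergence D (%) and time T (Myr)."""
    return divergence_pct / (2 * time_myr)


# (comparison, divergence D in %, T in Myr, rate reported in Table 3)
COMPARISONS = [
    ("Hsa 5' vs Hsa 3'", 9.1, T_INTEGRATION, 0.10),
    ("Hsa 5' vs Ppy 5'", 2.4, T_ORANG_SPLIT, 0.09),
    ("Hsa 5' vs OWm 5'", 7.2, T_OWM_SPLIT, 0.13),
    ("Hsa 3' vs Ppy 3'", 5.5, T_ORANG_SPLIT, 0.21),
    ("Hsa 3' vs OWm 3'", 8.7, T_OWM_SPLIT, 0.16),
    ("Ppy 5' vs Ppy 3'", 10.1, T_INTEGRATION, 0.11),
    ("Ppy 5' vs OWm 5'", 8.6, T_OWM_SPLIT, 0.15),
    ("Ppy 3' vs OWm 3'", 8.7, T_OWM_SPLIT, 0.16),
    ("OWm 5' vs OWm 3'", 10.5, T_INTEGRATION, 0.12),
]

rates = [mutation_rate(d, t) for _, d, t, _ in COMPARISONS]
for (label, d, t, reported), rate in zip(COMPARISONS, rates):
    print(f"{label}: D={d}%  T={t} Myr  rate={rate:.2f} (Table 3: {reported})")

# One plausible way to arrive at an overall average: pool the Table 3 rates
# with the two HERV-H values quoted in the text (0.10 and 0.12 %/Myr).
# The paper does not state exactly how its average was taken.
pooled = rates + [0.10, 0.12]
mean_rate = sum(pooled) / len(pooled)
print(f"pooled mean rate: {mean_rate:.3f} %/Myr")
```

Under these assumptions the pooled mean falls close to the 0.13%/Myr used for the age estimates.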

    positions in the human and ape genomes) allow one to estimate independently the mutation rate using the same formula but with a T value corresponding to the time passed since the splitting of the species under comparison. For example, assuming that the branching time for Old World monkeys is 28 Myr ago, the divergence of 7.2% between 5′-LTRs in humans and Old World monkeys corresponds to the mutation rate value of 0.13%/Myr. The figures calculated in this way are presented in Table 3.

    Similar data taken from Mager and Freeman (1995) for the LTRs belonging to the same proviruses of the HERV-H superfamily allowed us to obtain values of 0.1% for HERV-H(cH-4) and 0.12% for RTVL-H3. The average mutation rate of 0.13%/Myr obtained for LTRs from all the data above is very close to the rate for Alu sequences (Britten 1994). We also calculated the mutation rate using HERV-K LTR intrabranch average sequence divergences between LTRs belonging to different HERVs reported by Medstrand and Mager (1998). The division of the divergences by the phylogenetic time of the corresponding group emergence gave an independent evaluation of the average mutation rate value of 0.12%/Myr, which is in good agreement with the values obtained above.

    Medstrand and Mager (1998) demonstrated that intrabranch divergences were roughly proportional to the group age. This important observation means that different branches evolved at similar rates, i.e. the majority of branch members in different branches were under the same selective pressure in the genome. Using the value of 0.13% for the LTR mutation rate, the ages of different branches were calculated (Table 2). It appeared that, along with very old branches (as old as 50 Myr), there were also young groups aged about 3–6 Myr, (L4)–(L3) respectively. A broad spectrum of ages was also demonstrated for the HERV-H LTR superfamily (Anderssen et al., 1997), and for another set of the HERV-K (HML-2) family LTRs (Medstrand and Mager, 1998). Though the time uncertainty is high, as discussed in a previous paper (Lavrentieva et al., 1998), the estimates in Table 2 suggest that the LTRs of the youngest group II-L emerged in the human genome about 3–6 Myr ago, approximately at the time of branching between hominid and chimpanzee lineages. Other groups were integrated in the genome earlier, at different times (Table 2). We suggest that many of the old LTRs should be in orthologous positions in the genomes of all hominoids, though one could also expect some differences in integration sites among different primate species in the case the retropositions of these old representatives continued after branching events. Moreover, the human genome is supposed to contain a greater proportion of the youngest groups members. To check these indirect conclusions, and to confirm the presence or absence of the LTR in the site, a sequence analysis of the LTR integration sites in different species is required. Accordingly, we sequenced flanking DNAs of some LTRs integrated in different positions of the human genome. In addition, LTR flanking sequences can already be found in databases. These sequences were used for PCR analysis of the genomic DNAs from different primate species.

    3.2. Isolation of the LTR flanking sequences

    The procedure used for the isolation of flanking sequences was based on the PCR-suppression effect (PS-effect; Siebert et al., 1995) (see Fig. 1). LTR-containing cosmid DNAs were digested with a restriction enzyme (R) and tagged with an adapter pair of complementary oligonucleotides (1 and 2, Table 1) of unequal lengths. After filling in with DNA polymerase the termini of each DNA restriction fragment ligated to the adapters are converted into inverted repeats (Fig. 1, line B). In this way single-stranded restriction fragments possess self-complementary termini capable of forming intramolecular stem–loop structures (Fig. 1, line C). Moreover, the ligated adapter is a GC-rich long oligonucleotide (40 nt) that facilitates and strengthens the


    Fig. 1. A scheme of using PCR suppression effect for amplification of LTR flanking regions. (A) Schematic representation of a genomic DNA region containing an LTR. Vertical lines marked with R designate restriction endonuclease recognition sites; grey boxes, positions of LTRs. (B) DNA fragments with ligated suppression adapters. Open boxes designate short oligonucleotides complementary to the 3′-end of a 40 nt T7Not1 suppression adapter; two light-shaded boxes mark the parts of the adapter corresponding to A1 and A2 (Table 1) primers. (C) Pan-handle structures formed by single-stranded DNA fragments arising at the denaturation step. Dark-shaded boxes designate the ends filled in by Taq DNA polymerase before the first denaturation step and complementary to the adapter. Positions of A1 and T1 primers in the pan-handle structure are indicated by the arrows with corresponding symbols. (D) PCR fragments with different termini formed through amplification with T1 and A1 primers. (E) PCR fragments obtained by nested PCR using A2 and T2 primers.

    sticking of its self-complementary ends. Obviously, PCR amplification of such DNA fragments using A1-primer corresponding to the outermost parts of the termini will be suppressed. However, the PCR will take place with the simultaneous use of two primers: A1-primer and T-primer complementary to a single-stranded part of the stem–loop structure of the fragment (Fig. 1, line C, left). In this case newly synthesised PCR products will have two different, not self-complementary termini unable to form stem–loop structures (Fig. 1, line D). These PCR products are not subject to the PS-effect and thus can be efficiently amplified using the A1 and T pair of primers. To increase the specificity of the amplification, nested PCR with A2 and T2-primers was used (Fig. 1, line E). This mechanism ensures the efficient amplification of only those fragments that contain the targeted sequence.

    Fig. 2 shows an example of the two-step PCR. The technique allowed us specifically to produce LTR flanking sequences for primary structure determination.
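
The difference between a suppressed fragment and an amplifiable one can be illustrated at the sequence level. The sketch below is a simplified model, not a simulation of PCR: the adapter and the T1 primer (oligonucleotide 8) are copied from Table 1, whereas `FLANK` is a hypothetical placeholder sequence and the function names are assumptions introduced here.

```python
ADAPTER = "GTAATACGACTCACTATAGGGCAGCGTGGTCGCGGCCGAGGT"  # oligonucleotide 1, Table 1
T1_PRIMER = "AACCCTGAGTTGACACAGCACATG"                  # oligonucleotide 8, Table 1
FLANK = "TTGACCATGGCATTCAGGATCC"  # hypothetical flanking DNA, illustration only

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_self_complementary_termini(strand: str, n: int = len(ADAPTER)) -> bool:
    """True if the 5'-terminal n nt can pair with the 3'-terminal n nt,
    i.e. the single strand can fold into a pan-handle (stem-loop)."""
    return strand[:n] == reverse_complement(strand[-n:])


# Restriction fragment carrying the adapter at both ends after fill-in:
# one strand reads adapter - insert - reverse complement of the adapter.
adapter_both_ends = ADAPTER + FLANK + reverse_complement(ADAPTER)

# Product of the A1 + T1 primer pair: adapter at one end only, the other
# end being defined by the LTR-specific primer.
a1_t1_product = ADAPTER + FLANK + reverse_complement(T1_PRIMER)

print(has_self_complementary_termini(adapter_both_ends))  # suppressed case
print(has_self_complementary_termini(a1_t1_product))      # amplifiable case
```

In this model only the fragment flanked by the adapter on both sides has termini that can anneal to each other, which corresponds to the pan-handle structures of Fig. 1, line C.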


    Fig. 2. Specific PCR amplification of the genomic DNA flanking the LTR within R30306 cosmid. Cosmid R30306 containing a Chr19q12 DNA fragment with LTR 31 (see Fig. 3) was digested with EcoRI restriction enzyme, and the restriction fragments were ligated to the adapter (oligonucleotides 1 and 2, Table 1). (A) A schematic representation of an LTR with its U3, R and U5 and genomic flanking regions. The designations of the A primers used for the two-step PCR amplification correspond to those in Fig. 1. The LTR-specific primers used for the amplification of the U3 and U5 flanks are marked as T1r, T2r and as T1f, T2f respectively. (B) Specific PCR amplification of the LTR flanking region adjacent to the U3 terminus of the LTR. In the first PCR step, the A1-primer (oligonucleotide 3, Table 1) corresponding to the 5′-outermost part of the ligated adapter, and the T1r-primer targeted at the U3 region of the LTR (oligonucleotide 8, Table 1; Fig. 1) were used. The PCR product generated in the first PCR step (column R1) was re-amplified with T2r (oligonucleotide 6, Table 1) and A2-primers (oligonucleotide 4). The resulting PCR product is shown in column R2. (C) The specific amplification of the U5 flanking region of the LTR. The same as in (B) but with primers T1f (oligonucleotide 9, Table 1) and T2f (oligonucleotide 12, Table 1) corresponding to the U5 region of the LTR. The first (R1 column) and the second (R2 column) step products of the PCR are shown. The figures in the right part of the gel images indicate the molecular masses of the marker (M).
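
The choice of the second-round T2 primer depends on the T1 primer used in the first round. A minimal sketch of one way to express this pairing is given below. The primer numbers and designations are copied from Table 1; the rule of matching primers by the two-letter suffix of their designations is an assumption made here, and it is confirmed by the source only for the two pairs named in the caption of Fig. 2 (8 with 6, and 9 with 12).

```python
# Oligonucleotide number -> designation, copied from Table 1.
T1_PRIMERS = {7: "214oR", 8: "214yR", 9: "915yF", 10: "922oF"}
T2_PRIMERS = {5: "192oR", 6: "192yR", 11: "927oF", 12: "927yF"}

A1_PRIMER_NO = 3  # adapter primer for the first round
A2_PRIMER_NO = 4  # adapter primer for the nested (second) round


def primer_class(designation: str) -> str:
    """Two-letter suffix of a designation, e.g. 'yR' in '214yR'."""
    return designation[-2:]


# Pair each T1 primer with the T2 primer that shares its suffix (assumed rule).
NESTED_PARTNER = {
    t1: next(t2 for t2, d2 in T2_PRIMERS.items()
             if primer_class(d2) == primer_class(d1))
    for t1, d1 in T1_PRIMERS.items()
}


def two_step_primers(t1_number: int):
    """Return ((A1, T1), (A2, T2)) oligonucleotide numbers for both rounds."""
    return (A1_PRIMER_NO, t1_number), (A2_PRIMER_NO, NESTED_PARTNER[t1_number])


print(NESTED_PARTNER)
print(two_step_primers(8))
```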

    3.3. Sequence features of the LTR flanks

    The LTR flanking sequences obtained (Fig. 3) could be subdivided into two categories: (A) rather long unique sequences on both sides of the LTR and (B) sequences containing representatives of known families of interspersed repeated elements like Alu, LINE or some retroviral genes on one or both sides of the LTR. At the moment only five of the 15 LTRs analysed here appeared to be integrated into unique genomic sequences. Having analysed the data available in the databases we found that about 75% of LTRs detected


    Fig. 3. Strategy of LTR flanks sequencing. DNA restriction fragments used as templates for PCR amplification are designated by bold horizontal lines, the sites of EcoRI (RI), AluI (AI) or PstI (PI) are indicated at the corresponding ends of the fragments. Grey boxes show positions of LTRs; U3, R and U5 regions of LTRs are indicated within the boxes. Repetitive elements within DNA fragments are designated by variously shaded boxes, their types being indicated under the boxes. Arrows designate directions of sequencing and their lengths approximate the lengths of the sequences obtained. Lengths of several large LTR flanking fragments are indicated under the corresponding parts of the fragments. Locations of LTRs on human chromosomes are shown at the right ends of DNA fragments. The designations of the LTRs are shown in brackets on the right of the figure. Locations of sequences homologous to retroviral gag and env genes are indicated.

    in the human genomic sequences were integrated coincidentally with other repeats, whereas only 25% were integrated at the distances longer than 200 bp from other known repeats (P. Khil, unpublished results). This distribution reflects the known trend of retroposons to be reiteratively inserted into or next to pre-existing retroelements (Kidwell and Lisch 1997). Frequent coincidences of HERVs and their LTRs with Alu and LINE repeats in the human genome were also noticed earlier by us and other authors (Baban et al., 1996; Khil et al., 1997). The extent of reiterative integrations into pre-existing elements is sometimes very impressive. For example, over 50% of the maize genome is represented by retroelements (Kidwell and Lisch, 1997). However, the maize genome was not knocked out because highly repetitive elements were mostly targeted at intergene


    Fig. 4. Examples of PCR amplifications of genomic DNAs from different primate species with the primers corresponding to sequences flanking individual LTRs in the human genome. A chromosome 19 ideogram with the HERV-K (HML-2) LTRs mapped (triangles) is depicted in the upper part of the figure. The results of the LTR-containing loci amplification are as follows. (A) LTR 41 (Table 4) containing locus. The F1 (LTR41-F, Table 1) and R1 (LTR41-R, Table 1) primers for amplification are shown as arrows together with the lengths of the LTR flanks including the primer sequences. (B) LTR 50b (Table 4) locus. The primers for amplification were F2 (LTR50b-F, Table 1) and R2 (LTR50b-R, Table 1). Other designations as in (A). (C) LTR 70 locus (Table 4). The primers for amplification were F3 (LTR70-F, Table 1) and R3 (LTR70-R, Table 1). Expected lengths of the PCR products for LTRs integrated in the loci are marked over the double-headed arrows.


    Table 4
    Integration of individual HERV-K (HML-2) LTRs in primate genomes

    LTR number a   LTR branch   Accession   Chromosome location   Primate species b                 Integration time c (Myr)
                                                                  Hu Ch Gor Oran Gib OWm NWm

    II-L3 AC002508 7q31.1–7q31.2 +
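
A presence/absence pattern of the kind recorded in Table 4 brackets the integration time between two branching events. The sketch below shows this reasoning under stated assumptions: the species abbreviations follow the header of Table 4 and the branching times are those of footnote c to Table 3, while the function `integration_window` is a hypothetical helper. It ignores lineage-specific deletions and does not reproduce the values of Table 4, which is truncated in this transcript.

```python
from typing import Optional, Set, Tuple

# Branching times from the human lineage (Myr), footnote c to Table 3,
# ordered from the closest to the most distant relative.
BRANCHING_MYR = [
    ("Ch", 5.6), ("Gor", 8.0), ("Oran", 13.0),
    ("Gib", 18.0), ("OWm", 28.0), ("NWm", 45.0),
]


def integration_window(present: Set[str]) -> Tuple[float, Optional[float]]:
    """Bracket the integration time of an LTR found in the human genome.

    `present` holds the abbreviations of the non-human species whose
    orthologous locus also carries the LTR. Returns (minimum age, maximum
    age) in Myr; the maximum is None when every listed species has the LTR.
    """
    minimum_age = 0.0
    for species, split in BRANCHING_MYR:
        if species in present:
            minimum_age = split  # insertion predates this split
    maximum_age = next(
        (split for species, split in BRANCHING_MYR
         if species not in present and split > minimum_age),
        None,
    )
    return minimum_age, maximum_age


print(integration_window(set()))            # human-specific insertion
print(integration_window({"Ch", "Gor"}))    # shared with chimp and gorilla
```

An interval obtained this way can then be compared with the branch age estimated from sequence divergence (Table 2), which is the comparison the abstract describes as being in good agreement.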


    of primates jointly supported by the Department of Medical Genetics of the Ludwig-Maximilians-University, Munich and the German Primate Centre, Goettingen.

    The authors thank Dr B. Glotov for helpful discussions and assistance in the manuscript preparation. The work was supported by grants 98-04-48798 of the Russian Foundation for Basic Research, HHMI International Research Scholars award 75195-544201, and partly supported by the Human Genome State Project of Russia and INTAS-96-1710.

    References

    Anderssen, S., Sjottem, E., Svineng, G., Johansen, T., 1997. Comparative analyses of LTRs of the ERV-H family of primate-specific retrovirus-like elements located from marmoset african green monkey and man. Virology 234, 14–30.

    Baban, S., Freeman, J.D., Mager, D.L., 1996. Transcripts from a novel human KRAB zinc finger gene contain spliced Alu and endogenous retroviral segments. Genomics 33, 463–472.

    Benit, L., Lallemand, J.B., Casella, J.F., Philippe, H., Heidmann, T., 1999. ERV-L elements: a family of endogenous retrovirus-like elements active throughout the evolution of mammals. J. Virol. 73, 3301–3308.

    Britten, R.J., 1994. Evidence that most human Alu sequences were inserted in a process that ceased about 30 million years ago. Proc. Natl. Acad. Sci. USA 91, 6148–6150.

    Britten, R.J., 1996. DNA sequence insertion and evolutionary variation in gene regulation. Proc. Natl. Acad. Sci. USA 93, 9374–9377.

    Britten, R.J., 1997. Mobile elements inserted in the distant past have taken on important functions. Gene 205, 177–182.

    Lower, R., Lower, J., Kurth, R., 1996. The viruses in all of us: characteristics and biological significance of human endogenous retrovirus sequences. Proc. Natl. Acad. Sci. USA 93, 5177–5184.

    Mager, D.L., Freeman, J.D., 1995. HERV-H endogenous retroviruses: presence in the New World branch but amplification in the Old World primate lineage. Virology 213, 395–404.

    Medstrand, P., Mager, D.L., 1998. Human-specific integrations of the HERV-K endogenous retrovirus family. J. Virol. 72, 9782–9787.

    Mighel, A.J., Markham, A.F., Robinson, P.A., 1997. Alu sequences. FEBS Lett. 417, 1–5.

    Miller, M., Zeller, K., 1997. Alternative splicing in lecithin:cholesterol acyltransferase mRNA: an evolutionary paradigm in humans and great apes. Gene 190, 309–313.

    Minghetti, P.P., Dugaiczyk, A., 1993. The emergence of new DNA repeats and the divergence of primates. Proc. Natl. Acad. Sci. USA 90, 1872–1876.

    Nickerson, E., Nelson, D.L., 1998. Molecular definition of pericentric inversion breakpoints occurring during the evolution of humans and chimpanzees. Genomics 50, 368–372.

    Patience, C., Wilkinson, D.A., Weiss, R.A., 1997. Our retroviral heritage. Trends Genet. 13, 116–120.

    Schulte, A.M., Lai, S., Kurtz, A., Czubayko, F., Riegel, A.T., Wellstein, A., 1996. Human trophoblast and choriocarcinoma expression of the growth factor pleiotrophin attributable to germ-line insertion of an endogenous retrovirus. Proc. Natl. Acad. Sci. USA 93, 14759–14764.

    Schwartz, A., Chan, D.C., Brown, L.G., Alagappan, R., Pettay, D., Disteche, C., McGillivray, B., de la Chapelle, A., Page, D.C., 1998. Reconstructing hominid Y evolution: X-homologous block, created by X–Y transposition, was disrupted by Yp inversion through LINE–LINE recombination. Hum. Mol. Genet. 7, 1–11.

    Sibley, C.G., Ahlquist, J.E., 1987. DNA hybridization evidence of hominoid phylogeny: results from an expanded data set. J. Mol. Evol. 26, 99–121.

    Siebert, P.D., Chenchik, A., Kellogg, D.E., Lukyanov, K.A., Lukya-

    nov, S.A., 1995. An improved PCR method for walking in unclonedChou, H.H., Takematsu, H., Diaz, S., Iber, J., Nickerson, E., Wright,

    genomic DNA. Nucleic Acids Res. 23, 10871088.K.L., Muchmore, E.A., Nelson, D.L., Warren, S.T., Varki, A.,

    Simpson, G.R., Patience, C., Lower, R., Tonjes, R.R., Moore, H.D.,1998. A mutation in human CMPsialic acid hydroxylase occurred Weiss, R.A., Boyd, M.T., 1996. Endogenous D-type (HERV-K)after the HomoPan divergence. Proc. Natl. Acad. Sci. USA 95,

    related sequences are packaged into retroviral particles in the pla-11 75111756.

    centa and possess open reading frames for reverse transcriptase.Dangel, A.W., Baker, B.J., Mendoza, A.R., Yu, C.Y., 1995. Comple-

    Virology 222, 451456.ment component C4 gene intron 9 as a phylogenetic marker for

    Smit, A.F.A., 1996. The origin of interspersed repeats in the humanprimates: long terminal repeats of the endogenous retrovirus ERV-

    genome. Current Opin. Genet. Dev. 6, 743748.K(C4) are a molecular clock of evolution. Immunogenetics 42,

    Steinhuber, S., Brack, M., Hunsmann, G., Schwelberger, H., Dierich,4152.

    M.P., Vogetseder, W., 1995. Distribution of human endogenousGibbons, A., 1998. Which of our genes make us human? Science 281,

    retrovirus HERV-K genomes in humans and different primates.14321434.Hum. Genet. 96, 188192.Harris, J.R., 1998. Placental endogenous retrovirus ( ERV ): structural,

    Takahata, N., Satta, Y., 1997. Evolution of the primate lineage leadingfunctional and evolutionary significance. Bioessays 20, 307316.to modern humans: phylogenetic and demographic inferences fromKhil, P.P., Kostina, M.B., Azhikina, T.L., Kolesnik, T.B., Lebedev,DNA sequences. Proc. Natl. Acad. Sci. USA 94, 48114815.Y.B., Sverdlov, E.D., 1997. Structural characteristics of four long

    Trask, B.J., Friedman, C., Martin-Gallardo, A., Rowen, L., Akinbami,terminal repeats (LTR) of human endogenous retroviruses and fea-

    C., Blankenship, J., Collins, C., Giorgi, D., Iadonato, S., Johnson,tures of their integration sites. Russ. J. Bioorg. Chem. 23, 406 411.F., Kuo, W.L., Massa, H., Morrish, T., Naylor, S., Nguyen, O.T.,Kidwell, M.G., Lisch, D., 1997. Transposable elements as sources ofRouquier, S., Smith, T., Wong, D.J., Youngblom, J., van den Engh,variation in animals and plants. Proc. Natl. Acad. Sci. USA 94,G., 1998. Members of the olfactory receptor gene family are con-77047711.tained in large blocks of DNA duplicated polymorphically near theKing, M.C., Wilson, A.C., 1975. Evolution at two levels in humansends of human chromosomes. Hum. Mol. Genet. 7, 1326.and chimpanzees. Science 188, 107116.

    Vinogradova, T., Volik, S., Lebedev, Y., Shevchenko, Y., Lavrentyeva,Lavrentieva, I., Khil, P., Vinogradova, T., Akhmedov, A., Lapuk, A.,I., Khil, P., Grzeschik, K.H., Ashworth, L.K., Sverdlov, E.D.,Shakhova, O., Lebedev, Y., Monastyrskaya, G., Sverdlov, E.D.,1997. Positioning of 72 potentially full size LTRs of human endoge-1998. Subfamilies and nearest-neighbour dendrogram for the LTRsnous retroviruses HERV-K on the human chromosome 19 map.of human endogenous retroviruses HERV-K mapped on humanOccurrence of the LTRs in human gene sites. Gene 199, 255264.chromosome 19: physical neighbourhood does not correlate with

    Yoder, J.A., Walsh, C.P., Bestor, T.H., 1997. Cytosine methylationidentity level. Hum. Genet. 102, 107116.

    and the ecology of intragenomic parasites. Trends Genet. 13,Leib-Mosch, C., Seifarth, W., 1996. Evolution and biological signifi-

    cance of human retroelements. Virus Genes 11, 133145. 335340.