Volume 14 Number 5 1986 Nucleic Acids Research
Sequence heterogeneity within the human alphoid repetitive DNA family
P.Devilee, P.Slagboom, C.J.Cornelisse1 and P.L.Pearson
Department of Human Genetics and 1Department of Pathology, University Medical Center, Leiden, The Netherlands
Received 26 November 1985; Revised and Accepted 13 February 1986
ABSTRACT.
We have cloned and determined the base-sequence and genome organization of
two human chromosome-specific alphoid DNA fragments, designated LI.26, mapping
principally to chromosomes 13 and 21, and LI.84, mapping to chromosome 18.
Their copy number is estimated to be approximately 2,000 per haploid genome.
LI.84 has a double-dimer organization, whereas LI.26 has a much less defined
higher order tandem organization. Further, we present evidence that the
restriction-site spacing within the alphoid DNA family is chromosome specific.
From sequence analysis, clones LI.26 and LI.84 are found to consist of 5 and 4
tandemly duplicated 170 bp monomers respectively. Cross-homology between the
various monomers is 65-85%. The analysis suggests that the evolution of
tandem-arrays does not take place via a defined 340 bp unit, as was inferred by
others, but via circularly permutated monomers or multimers of the 170 bp unit.
INTRODUCTION.
Restriction enzyme analysis has shown that human DNA contains many families
of repeated DNAs (1-4). These differ from one another with respect to genomic
organization, repeat-lengths and copy number. The AluI family, for example,
comprises approximately 300,000 copies of a short (300 bp) sequence, interspersed
among stretches of unique sequences or genes (4). In contrast, the
KpnI family is interspersed throughout the genome in longer fragments with a
lower repetition frequency (3).
The alphoid DNA family (5), so termed because of its homology to the alpha
component in the African Green Monkey (6), is an example of a different type of
organization, characterized by long arrays of tandemly repeated 170 bp units.
It is commonly referred to as "satellite"-DNA, although it is distinct in
sequence from any of the human satellite DNA peaks obtained after isopycnic
centrifugations (7). In man, the alphoid family forms a pronounced 340 bp and
680 bp band in ethidium bromide stained gels of EcoRI digested genomic DNA.
When partial digests are analyzed by Southern blotting, using the 340 bp frag-
ment as probe, a "ladder" of bands is observed, the steps of which correspond
to multiples of 170 bp (8). Densitometer scanning suggests that the 340 bp
© IRL Press Limited, Oxford, England.
band represents 0.75% of the genome, corresponding to about 55,000 copies (8).
Recently, it has become apparent that the alphoid DNA family can be divided
into subfamilies, some of which may be characteristic of specific chromosomes.
Thus, the EcoRI-dimer described by Manuelidis (9) is located predominantly in
the centromere regions of chromosomes 1, 3, 7, 10 and 19. A 2.0 kb BamHI-fragment
is similarly specific for the X-chromosome (10), while a 5.5 kb EcoRI-fragment
characterizes the Y-chromosome (11). We have isolated two alphoid sequences,
designated LI.26 and LI.84, and have found them to be principally localized to
the pericentric regions of chromosomes 13 and 21 (LI.26) and chromosome 18
(LI.84) (12). In this article we present the sequence analyses of LI.26 and
LI.84, which show that they consist of 5 and 4 tandemly organized alphoid
subunits respectively, each approximately 170 bp long. Within the 170 bp units,
some regions appear conserved while others are more variable.
Comparison of chromosome specific members of the alphoid DNA family will
give insight into the evolutionary constraints imposed on DNA sequences
adjacent to the centromere.
MATERIALS AND METHODS.
DNA sources and preparations.
Genomic DNA was isolated from cell lines or lymphocytes as described (13).
Recombinants LI.26 and LI.84 were selected from a random human recombinant
DNA-library (14), containing EcoRI-inserts from DNA restricted to completion
cloned in plasmid pAT153. Plasmid-DNA was prepared according to the methods of
Maniatis et al. (15).
Cell lines.
Human-rodent somatic cell hybrids were obtained as described earlier (16).
A hamster hybrid cell line with the X-chromosome as the only retained human
material was a kind gift of Dr. S. Goss, Dunn School of Pathology, Oxford.
Southern blotting and hybridizations.
Genomic DNA was digested with restriction enzymes as recommended by the
supplier, in a final volume of 20 µl. To ensure complete restriction, a
three-fold excess of enzyme-units was added. Digestion was monitored in a
parallel aliquot with phage lambda-DNA as internal marker. After three hours of
incubation, the samples were incubated for 10 minutes at 65°C, and loaded and
electrophoresed on 0.8% agarose gels in Tris-acetate. The separated DNA was
blotted onto nylon filters (Gene Screen, New England Nuclear) using standard
procedures (17). Overnight hybridization and subsequent washing of the filters
were performed at 65°C as described by Jeffreys and Flavell (18). The
hybridization-mixture contained 20 mM Tris-HCl pH 7.5, 2 mM EDTA, 3x SSC
(0.45 M NaCl, 0.045 M Na-citrate), 0.1 mg/ml salmon sperm DNA, 10x Denhardt's
solution (0.2% Ficoll, 0.2% BSA, 0.2% polyvinylpyrrolidone), 0.1% SDS, 5%
dextran-sulphate and 5 ng/ml 32P-nick-translated probe (19). Exposure was at
-70°C on Sakura film backed by an intensifying screen.
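As a worked illustration of the salt concentrations quoted above, the following minimal Python sketch derives the 1x SSC composition from the stated 3x values and scales it to other strengths. The function name `ssc_molarity` is hypothetical and introduced only for this example; the 0.1x value corresponds to the final wash mentioned in the Results.

```python
# 1x SSC composition implied by the text: 3x SSC = 0.45 M NaCl, 0.045 M Na-citrate.
SSC_1X = {"NaCl": 0.45 / 3, "Na-citrate": 0.045 / 3}  # mol/l


def ssc_molarity(fold):
    """Return the molar concentration of each SSC component at `fold`x strength."""
    return {salt: round(conc * fold, 6) for salt, conc in SSC_1X.items()}


hybridization = ssc_molarity(3)   # strength of the hybridization mixture
final_wash = ssc_molarity(0.1)    # strength of the final filter wash
print(hybridization, final_wash)
```

On these assumptions 1x SSC is 0.15 M NaCl and 0.015 M Na-citrate, so the 0.1x wash contains one thirtieth of the salt present during hybridization.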
Sequencing-strategy.
Both recombinants LI.26 and LI.84 were sequenced using the dideoxy chain
termination method (20). The inserts were recloned into the EcoRI-site of
M13mp8 (21), and single-strand recombinant phages containing opposite insert
strands were cultured. Thus, 250 bp from each end could be sequenced. The inner
200 bp of LI.84 were sequenced from an EcoRI-RsaI fragment (fig.5) subcloned
in M13mp10. Subcloning of LI.26 was as follows. The recombinant pAT153 plasmid
was linearized at the HindIII site. The insert contains no HindIII sites. In
the resulting linear fragment, the insert is at one extreme end. This was
treated for various times with the exonuclease Bal-31 (Boehringer Mannheim),
and the DNA-fragments were blunt-ended with DNA polymerase I (Klenow-fragment,
Boehringer). Subsequent digestion with EcoRI yielded EcoRI-blunt insert
fragments progressively shortened from one EcoRI-site by approximately 150 bp.
These were also subcloned in M13mp10 and single-stranded phages isolated (21).
The sequencing reactions were carried out with 35S-dATP (Amersham, 600
Ci/mmol) as label (22), using the New England Nuclear protocols. A 17-base
primer was the kind gift of dr. Van Boom, University of Leiden, The
Netherlands. Deoxy- and dideoxy nucleotides were supplied by Boehringer Mannheim.
Sequencing-gels were dried on a BioRad slabgel-dryer and exposed overnight at
room-temperature on Sakura X-ray film.
RESULTS.
LI.26 and LI.84 belong to the alphoid DNA family.
Two independent dot-blot experiments are shown in fig.1, from which the
copy number of LI.26 is estimated to be approximately 2,000 per haploid
genome. It is unlikely that sequences with less than 95% homology will be
detected under the applied hybridization conditions (see Materials and
Methods; final washing of the filter in 0.1x SSC). Densitometer scans (not
shown) indicate that approximately 60% of hybridizing signal is contained in
EcoRI fragments localized on chromosomes 13 and 21 (fig.4, and ref.12). The
results with LI.84 (not shown) were similar. When total human genomic DNA is
partially digested with EcoRI, blotted and hybridized with either LI.26
(fig.2, panel A) or LI.84 (panel B), a ladder of bands is formed early during
digestion. The lengths of these bands correspond to multiples of approximately
Figure 1. Dot-blot experiment using LI.26, recloned in M13mp8, as probe. The equivalent of 100 (1); 250 (2); 500 (3); 1,000 (4); 2,500 (5); 5,000 (6); 7,500 (7) and 10,000 (8) copies of the insert-fragment of LI.26 per haploid genome was spotted in duplo (a,b) on nylon Gene Screen membrane. One µg of total genomic DNA (46,XX) was spotted eight times as a reference (c). Probe was labeled by primer extension (ref.21). Filter was exposed for 4 hr.
Figure 2. Southern analysis of partial digests of 5 µg/lane total genomic DNA obtained with EcoRI, using LI.26 (panel A) or LI.84 (panel B) as probe. Numbers between the panels indicate multiples of 170 bp. Extent of digestion increases progressively from lane a (1/16 x complete) to f (2 x complete) in both panels. Exposure time was two days.
Figure 3. Southern blot of total genomic DNA hybridized with LI.26. Restriction enzymes used are: TaqI (a); XbaI (b); KpnI (c); HaeIII (d); EcoRI (e); BamHI (f); HindIII (g) and BglII (h). Exposure was overnight.
170 bp, indicating that LI.26 and LI.84 belong to the tandemly organized
alphoid DNA family. In completely digested samples, LI.26 and LI.84 hybridize
to the same ladder-pattern, but with different intensities per band. The
largest detectable multimer is a 16-mer in both instances; a fraction of
alphoid DNA remains resistant to EcoRI-restriction. Although the 340 bp band
is very pronounced in ethidium bromide stained gels, it hybridizes weakly with
either probe.
Apparently, LI.84 is organized predominantly as a tetramer; multiples
thereof (8-,12-,16-,20-mer) are detected early in the course of digestion
(Panel B, lane b) and the 4-mer and 8-mer are major bands in completely diges-
ted samples (Panel B, lane f). LI.26 does not cross-hybridize significantly
with the tetrameric higher order multimers of LI.84 (Compare lanes b, both
panels). Its organization is more complex: several longer multimers appear
simultaneously (Panel A, lane c) and are converted with different kinetics
into 8-, 5- and 4-mers. LI.26 contains a 5-mer, which is 0.85 kb in length,
Figure 4. Southern analysis of 10 µg of hybrids Cl 2D (panel A) and 34-2-3 B3 (panel B), containing chromosome X and chromosome 13 as their only retained human material respectively, using LI.26 as probe. Digestions were with EcoRI (panel A, lane a; panel B, lane c), BamHI (panel A, lane c; panel B, lane a) or with both (panels A and B, lanes b). 'M' is marker. Panel C shows partial digestion of total genomic DNA with BamHI (5 µg/lane). Extent of digestion increases progressively from lane a (1/16 x complete) to lane f (2 x complete). Exposure times: Panel A six days, Panel B three days, Panel C two days.
and LI.84 contains a 4-mer, 0.68 kb in length (see below). Digestion of
genomic DNA with other endonucleases further supports the tandem organization:
most enzymes produce ladders with the same fragment-lengths as EcoRI (fig.3).
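The relation between ladder steps and fragment lengths used in this section can be made explicit with a small Python sketch. It assumes an idealized monomer of exactly 170 bp (the text gives this value as approximate), and the helper names `multimer_length_kb` and `nearest_multimer` are hypothetical.

```python
MONOMER_BP = 170  # approximate alphoid monomer length quoted in the text


def multimer_length_kb(n):
    """Expected length in kb of an n-mer of the 170 bp unit."""
    return n * MONOMER_BP / 1000


def nearest_multimer(fragment_bp):
    """Number of 170 bp units that best matches a fragment length in bp."""
    return round(fragment_bp / MONOMER_BP)


for n in (4, 5, 8, 16):
    print(f"{n}-mer: {multimer_length_kb(n):.2f} kb")
```

With these assumptions the 5-mer and 4-mer come out at 0.85 kb and 0.68 kb, matching the lengths quoted for LI.26 and LI.84, and the largest detectable multimer (16-mer) corresponds to about 2.7 kb.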
Specific repeat-lengths reside on specific chromosomes.
Although LI.26 is mainly restricted to chromosomes 13 and 21 (12), it also
detects homologous alphoid sequences on the X-chromosome. The organization,
however, of alphoid sequences on chromosomes X and 13 is clearly different and
indicates that they followed a distinct evolutionary history. When LI.26 is
hybridized to a hybrid cell line containing a single human X-chromosome, it is
found that the tandem-structures are organized in large EcoRI fragments of
2-10 kb (fig.4, panel A, lane a). In contrast, in a chromosome 13 hybrid,
EcoRI-fragments are mostly 0.68 and 0.85 kb in length, with a few multimers up
to 1.7 kb (panel B, lane c), consistent with the overall organization of LI.26
(fig.2). Similarly, BamHI sites are almost absent from LI.26 and its homologs
on chromosome 13 (fig.4, panel B, lane a), while the homologs on the X-chromo-
some all show up as BamHI-multimers of 2.1, 3.0 and 4.0 kb (panel A, lane c).
Subsequent digestion with EcoRI reduces these multimers somewhat and yields a
1.6 kb fragment (Panel A, lane b). Partial digests of genomic DNA with BamHI
(fig.4, panel C) show that the organization of LI.26 homologs in the total
genome is similar to their organization on chromosome 13 or the X-chromosome:
either in very large fragments or in tandems of approximately 2 kb units.
Heterogeneity at the sequence level.
1. Organization.
The base sequence of both LI.26 and LI.84 is presented in fig.5; their
respective lengths are 849 bp and 684 bp. Both sequences reflect the tandemly
repeated organization of alphoid DNA. Most restriction-sites appear with a 170
bp spacing: HinfI at positions 288, 629 and 798 in LI.26; DdeI at positions
43, 213, 379 and 547 in LI.84. A comparison with two monomeric EcoRI-units
reported by Wu and Manuelidis (1), termed here a-1 and a-2, shows that the
homology with these monomers starts at an EcoRI-like recognition sequence at
position 26 in LI.26 and position 39 in LI.84 (elaborated in fig.6B). This
shift in EcoRI restriction sites results in the last 142 bp of both sequences
being homologous to the first 142 bp of a-1 and a-2. Similarly, the first 25
bp of LI.26 and LI.84 are homologous to the last 25 bp in both the a-repeats.
Between these regions lie several complete monomers, 4 in LI.26, 3 in LI.84.
Their lengths are 171 bp, or slightly less, the shortest unit being 166 bp
within LI.84 (position 209-374). However, the new phase of EcoRI sites with
respect to the a-1 and a-2 units does not break up the typical 170 bp spacing
of EcoRI sites in alphoid DNA. Fig.6A shows the position of average nucleotide
homology relative to EcoRI restriction sites in LI.26 and LI.84 relative to a
hypothetical tandem-structure of a-1 units.
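The spacing argument can be checked directly from the site positions listed in this section. The following Python sketch is illustrative only: it takes the HinfI and DdeI positions from the text, assumes a nominal 170 bp monomer, and uses hypothetical helper names (`spacings`, `monomer_units`).

```python
MONOMER_BP = 170  # nominal monomer length; the text reports 166-171 bp units

# Restriction-site positions as listed in the text.
SITES = {
    "HinfI in LI.26": [288, 629, 798],
    "DdeI in LI.84": [43, 213, 379, 547],
}


def spacings(positions):
    """Distances between consecutive site positions."""
    return [b - a for a, b in zip(positions, positions[1:])]


def monomer_units(distance, monomer=MONOMER_BP):
    """Nearest whole number of monomers and the remaining deviation in bp."""
    n = round(distance / monomer)
    return n, distance - n * monomer


for label, positions in SITES.items():
    for d in spacings(positions):
        n, offset = monomer_units(d)
        print(f"{label}: {d} bp = {n} x {MONOMER_BP} bp {offset:+d} bp")
```

Under these assumptions every interval lies within 4 bp of a whole multiple of 170 bp. The first HinfI interval (341 bp) spans two monomers, and the 166 bp DdeI interval reflects the shortest monomer of LI.84.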
An interesting feature of the sequence of LI.84 is the presence of a 14 bp
direct repeat at position 16 (arrows fig.5). If a-1 and/or a-2 were tandemly
repeated within LI.84, this small perfect direct repeat would be located at
the border between two of the repeated units, thereby disturbing their con-
tiguous organization. This direct repeat is the reason why LI.84 itself is
somewhat longer than 4 times 170 bp.
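The direct repeat can be located by a simple scan for a segment that is immediately followed by an identical copy. The sketch below uses the first 50 bases of LI.84 as printed in fig.5; the function name `tandem_repeat_starts` is hypothetical and the approach is a plain string comparison, not the procedure used in the original analysis.

```python
# First 50 bases of LI.84 as printed in fig.5 (1-based positions 1-50).
LI84_START = (
    "AATTCATCAA" "ATTGCAGACT" "GCAGCGTTCA" "GACTGCAGCG" "TTCTGAGAAA"
)


def tandem_repeat_starts(seq, unit_len):
    """1-based start positions of segments of length `unit_len` that are
    immediately followed by an identical copy of themselves."""
    hits = []
    for i in range(len(seq) - 2 * unit_len + 1):
        if seq[i:i + unit_len] == seq[i + unit_len:i + 2 * unit_len]:
            hits.append(i + 1)
    return hits


hits = tandem_repeat_starts(LI84_START, 14)
unit = LI84_START[15:29]  # bases 16-29
print(hits, unit)
```

The scan reports position 16, where AGACTGCAGCGTTC (16-29) is repeated at 30-43. It also reports position 15, because base 15 happens to equal base 29, so the repeat boundaries can be drawn one base further left without changing the result.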
LI.26
  1 AATTCAAATA AAAGGTAGAC AGCAGCATTC TCAGAAATTG CTTTCTGATG TCTGCATTCA ACTCATAGAG TTGAAGATTC CCTTTCATAG
 91 AGCAGGTTTG AAACACTCTT TCTGGAGTAT CTGGATGTGG ACATTTGGAG CGCTTTGATG CCTACGGTGG AAAAGTAAAT ATCTTCCCAT
181 AAAAACGAGA CAGAAGGATT CTCAGAAACA AGTTTGTGAT GTGTGTACTC AGCTAACAGA GTGGAACCTT TCTTTTTACA GAGCAGCTTT
271 GAAACTCTAT TTTTGTGGAT TCTGCAAATT GATATTTAGA TTGCTTTAAC GATATCGTTG GAAAAGGGAA TATCGTCATA CAAAATCTAG
361 ACAGAAGCAT TCTCACAAAC TTCTTTGTGA TCTGTGTCCT CAACTAACAG AGTTGAACCT TTCTTTTGAT GCAGCAATTT GGAAACACCT
451 TTTGGTAGAA AATGTAAGTG GATATTTGGA TAGCTTAACG ATTTCGTTGG AAACGGGAAT ATCATCATCT AAAATCTAGA CAGAAGCACT
541 ATTAAGAAAC TACTTGGTGA TATCTGCATT CAAGTCACAG AGTTGAACAT TCCCTTACTT TGAGCACGTT TGAAACACTC TTTTGGAAGA
631 ATCTGGAAGT GGACATTTGG AGCGCTTTGA TGCCTTTGGT GAAAAGGAAA CGTCTTCCAA TAAAAGCCAG ACAGAAGCAT TCTCAGAAAC
721 TTGTTCGTGA TGTGTGTACT CAACTAAAAG AGTTGAACCT TTCTATTGAT AGAGCAGTTT TGAAACACTC TTTTTGTGGA TTCTGCAAGT
811 GGATATTTGG ATTGCTTTGA GGATTTCGTT GGAAGCGGG
LI.84
  1 AATTCATCAA ATTGCAGACT GCAGCGTTCA GACTGCAGCG TTCTGAGAAA CATCTTTGTG ATGTTTGTAT TCAGGACACC AGAGTTGAAC
 91 ATTCCCTATC ATAGAGCAGG TTTGAATCAC TCCTTTTGTA GTATCTGGAA GTGGACATTT GGAGGCTTTC AGGCCTATGT TGGAAAAGGA
181 AATATCTTCC ATAACAACTA GACAGAAGCA TTCTCAGAAC TTATTTGAGA TGTGTGTACT CACACTAAGA GAATTGAACC ACCGTTTTGA
271 AGGAGCAGTT TTGAAACACT CTTTTTCTGG AATCTGCAM GTGGATATTT GGCTAGCTTT GGGGATTTCG CTGGAACGGA ATACATATAA
361 AAAGCACACA GCAGCGTTCT GAGAAACTGC TTTCTGATCT TTGCATTCAA GTCAAAAGTT GAACACTCCC TTTCATAGAG CAGTCCTGAA
451 ACACTCTTTT GTAGTATCTG GAACTGGACT TTTGGAGCGC TTTCAGGGCT AAGGTGAAAA AGAAATATCT TCCCATAAAA ACTGGACAGA
541 ATCATTCTCA GAAACTrGTT TATGCTGTAT CTACTCAACT AACAAAGTTG AACCTTTCTT TTGATAGAGC AGTTTTGAAA TGCTCTTTTT
631 GTGGAATCTG CAAGTGGATA TTTGGTTAGT TTTGAGGATT TCGTTGGAAG CGGG
Figure 5. Complete nucleotide sequences of LI.26 and LI.84. Numbered arrows above the sequences indicate the start of homology to the reported 171 bp alphoid consensus sequence (ref.1). 'END' marks the end of homology to the last 25 bp of the latter. Restriction-sites are indicated by filled triangles. These are: AccI (A); DdeI (D); HinfI (H); RsaI (R) and XbaI (X). A 14 bp direct repeat at position 16 in LI.84 is indicated by horizontal arrows.
2. Cross-homologies.
We have aligned the monomer units of LI.26 and LI.84 and compared them to
several reported alphoid units (Fig.6B and Table I). These are a-1 and a-2
(1); a-X, the consensus sequence from the human X-chromosome (23); a-Y, a
monomer originating from the Y-chromosome (11); and SPC-1, a monomer detected
on small polydisperse circular DNA (24). The cross-homology between the units
Figure 6. A. Position of average nucleotide homology relative to restriction sites in LI.26, LI.84 and a-X relative to a hypothetical tandem-structure of a-1 units. B. Comparison of the monomer sequences of LI.26, LI.84, the human a-dimer (ref.1), a-X (ref.23), a-Y (ref.11) and SPC-1 (ref.24). Comparison is made relative to a-2. Only bases which differ from this sequence are shown. Deletions are indicated by (-), positions where more than three base changes occur by (•). For maximum alignment, bases 18, 80, 243, 310 and the 14 bp repeat were deleted from LI.84, base 14 was deleted from LI.26. Monomer numbers of LI.26 and LI.84 correspond to numbering in fig.5.
Positions (fig.5) of the aligned segments as listed in panel B: LI.26: 1-25, 26-196, 197-367, 368-536, 537-607, 608-849; LI.84: 1-27, 39-208, 209-374, 375-542, 543-684.
of our probes and a-1 and a-2 was found to vary between 68% and 82% with an
average of 75% and 73% for LI.26 and LI.84 respectively (Table I). Slightly
lower homologies were noted between our probes and SPC-1 (72% and 70% resp.),
and a-Y (71% and 72% resp.), whereas the a-X sequence seems somewhat more
related at approximately 80% homology in both instances. Between LI.26 and
LI.84 there exists an overall cross-homology of 75%, although much higher
homologies can be detected when smaller regions are compared (e.g. 92% between
Table I. Sequence homologies between the monomers of LI.26 and LI.84, both monomers of the human a-dimer (ref.1), the consensus a-X monomer (ref.23), a monomer found on small polydisperse circular DNA (ref.24), and a-Y, a monomer derived from the human Y-chromosome (ref.11). Numbers above the diagonal represent percent identity of the two compared sequences. Numbers below the diagonal represent mean cross-homology of the sequences that fall within the boxed region. Monomer numbers of LI.26 and LI.84 correspond to numbering in fig.5.

Percent identity (above the diagonal); columns 26/1 to 26/5 are the LI.26 monomers, 84/1 to 84/4 the LI.84 monomers:

            26/1  26/2  26/3  26/4  26/5  84/1  84/2  84/3  84/4   a-1   a-2   a-X   a-Y SPC-1
LI.26/1        -    71    72    81    75    82    69    84    72    76    75    82    74    74
LI.26/2              -    80    65    85    72    73    67    80    73    76    77    69    70
LI.26/3                    -    69    85    71    75    68    82    70    77    81    70    73
LI.26/4                          -    69    77    65    75    66    73    69    75    70    70
LI.26/5                                -    73    85    71    90    75    82    85    73    73
LI.84/1                                      -    67    82    71    77    73    80    75    72
LI.84/2                                            -    65    82    71    68    76    70    67
LI.84/3                                                  -    68    75    71    79    73    71
LI.84/4                                                        -    73    77    82    70    70
a-1                                                                  -    73    78    75    69
a-2                                                                        -    84    71    73
a-X                                                                              -    81    78
a-Y                                                                                    -    76
SPC-1                                                                                        -

Mean cross-homology of the boxed regions (below the diagonal):

            LI.26 monomers  LI.84 monomers
LI.84          75 ± 10
a-1            73 ± 3          74 ± 3
a-2            76 ± 6          72 ± 5
a-X            80 ± 5          79 ± 3
a-Y            71 ± 2          72 ± 2
SPC-1          72 ± 2          70 ± 3
the last 100 bp of both sequences, not shown). Thus, after comparing overall
homologies, it appears that no specific tandem-sequence reported so far is
significantly more related to any of the others. This may be explained by the
scattered distribution of base substitutions among the fourteen compared
monomer sequences (fig.6B). About 70% of all positions underwent two or fewer
base changes.
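For readers who wish to reproduce figures of the kind given in Table I, percent identity between two aligned monomers can be computed as in the following Python sketch. The treatment of deletions (a gap opposite a base counts as a mismatch, a gap opposite a gap is skipped) is an assumption of this example and is not stated in the text; the two short sequences are hypothetical toy data.

```python
def percent_identity(a, b, gap="-"):
    """Percent of compared positions that carry the same base in two aligned
    sequences. Columns where both sequences have a gap are skipped; a gap
    opposite a base counts as a mismatch (an assumption of this sketch)."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical 10-base toy alignment with one substitution.
print(percent_identity("GATTCTCAGA", "GATTCACAGA"))
```

In the toy alignment nine of ten positions agree, giving 90% identity.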
Within LI.84, the first full length monomer is 82% homologous to the third
and 67% and 71% to the second and fourth respectively. The second monomer
shares 82% homology with the fourth monomer. This distribution of homologies is
suggestive of a basic 340 bp homology as proposed by Wu and Manuelidis (1) for
the consensus alphoid structure. In LI.26, however, the various
cross-homologies do not show such a spacing pattern; the first full length
monomer is 81% homologous to the fourth monomer, while the second monomer is
80% homologous to the third. These two units are also closely related to the fifth (not full
length) monomer. If homologies of more than 80% are grouped, another kind of
spacing, represented by 'a-b-b-a-b' may be proposed.
3. Sequence-conservation.
Because of the scattered distribution of base substitutions, conserved
regions are not easily defined when all alphoid sequences are compared. Only
when more variable positions are first defined (dots in fig.6B), do some small
relatively more conserved regions become apparent. These include positions
3-13, 17-26, 42-51, 75-89, 95-111 and 140-148 and show a clustering of pos-
itions where no base change occurred at all.
DISCUSSION.
We have examined the genomic organization of two human alpha-satellite
DNA-sequences, designated LI.26 and LI.84. They were previously shown to map
predominantly to chromosomes 13, 21 and 18 respectively (12). Several lines of
evidence suggest that both sequences represent distinct subgroups of the
alphoid DNA family:
(a). Under our hybridization conditions, the copy-number of LI.26 and LI.84 is
about 2,000 per haploid genome each. Since the 340 bp EcoRI-fragment is
estimated to be present in about 55,000 copies (8), this indicates that each
probe detects a small subset of alphoid DNA sequences.
(b). Southern hybridizations to EcoRI-digested genomic DNA show that both
probes hybridize to the same series of 170 bp multimers, but each produces a
signal of different intensity per band. Further, LI.84 is largely organized
into tetrameric units whereas LI.26 has a more complex organization.
(c). Sequence analysis of LI.26 and LI.84 shows that they each have diverged
about 25% from the 340 bp EcoRI-fragment reported by Wu and Manuelidis (1),
which is the reason for their poor hybridization to this band. LI.26 and LI.84
also show a 25% sequence divergence between one another.
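The size of the subset referred to under (a) can be estimated with simple arithmetic. The Python sketch below uses only numbers quoted in the text (2,000 copies per probe, 55,000 copies of the 340 bp fragment representing 0.75% of the genome, insert lengths of 849 and 684 bp). The comparison in base pairs and the implied haploid genome size are derived here for illustration and are not figures given by the authors.

```python
PROBE_COPIES = 2_000             # copies of LI.26 or LI.84 per haploid genome
DIMER_COPIES = 55_000            # copies of the 340 bp EcoRI fragment (ref.8)
DIMER_BP = 340
DIMER_GENOME_FRACTION = 0.0075   # 0.75% of the genome (ref.8)
INSERT_BP = {"LI.26": 849, "LI.84": 684}

# Ratio of copy numbers, as compared in the text.
copy_fraction = PROBE_COPIES / DIMER_COPIES

# The same comparison expressed in base pairs (an addition of this sketch).
dimer_total_bp = DIMER_COPIES * DIMER_BP
bp_fraction = {
    name: PROBE_COPIES * length / dimer_total_bp
    for name, length in INSERT_BP.items()
}

# Haploid genome size implied by the 0.75% and 55,000-copy figures.
implied_genome_bp = dimer_total_bp / DIMER_GENOME_FRACTION

print(f"{copy_fraction:.1%}", bp_fraction, f"{implied_genome_bp:.2e}")
```

On these figures each probe corresponds to under 4% of the EcoRI-dimer copy number, or roughly 7 to 9% when the longer insert lengths are taken into account.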
It seems, therefore, that this family is a highly heterogeneous collection
of sequences, all variations on a 170 bp motif. The members diverge in
sequence composition, but remain clearly related (fig.6). Digestion with EcoRI
(or several other enzymes) results in a distribution of the heterogeneous
units among the bands that form the ladder. Consequently, each step in the
ladder consists of several DNA fragments of similar length, but with different
base sequences. According to our results, this variation may amount to 35%
(Table I). Hybridization with a representative of a specific alphoid subfamily
under stringent conditions lights up only the most homologous multimers. Thus,
probe representatives of two different alphoid subfamilies may hybridize to
the same band in the ladder, though not necessarily because of cross-homology
to each other, but due to comigration of different genomic alphoid sequences
to the same position. However, some hybridization may also occur from cross-
homology since closely related monomers are scattered throughout the various
tandem-structures (Table I).
Since some monomers of LI.26 are over 80% homologous to the X-chromosomal
consensus unit of Waye and Willard (23), it is not surprising to find LI.26
hybridizing to DNA of a hybrid cell line containing the X-chromosome as its
only retained human material. Although longer exposure times are needed (see
legend fig.4), the obtained restriction pattern closely resembles the reported
one obtained with a chromosome X-derived alphoid sequence (10). This suggests
that virtually all X-chromosomal alphoid DNA is organized in 2.0 kb BamHI
units as described, although we cannot fully exclude the presence of distinct-
ly organized divergent sequences left undetected by both probes. Further, we
showed that chromosome 13 specific alphoid DNA is distinct from the X-chromo-
some in its organization of restriction sites. Thus, the sequence-heterogenei-
ty within the alphoid family is distributed in a chromosome-specific manner
with a characteristic restriction site spacing for each enzyme and chromosome.
The speculation that these subfamilies play a role in discriminating chromo-
somes from one another (25,26), is therefore attractive.
The survival of alphoid DNA in the genome during evolution has led to
suggestions that it may serve in chromosome structure (25); nucleosome arr-
angement (27); and homologous chromosome recognition (reviewed in 28). As yet,
none of these alleged functions has been confirmed. Alternatively, it may be
an evolutionary "hitchhiker" with no special function (29). Sequence analysis
reveals some aspects of alphoid DNA evolution. We have found a 14 bp direct
repeat within LI.84 at the border of two units defined by Wu and Manuelidis.
This 14 bp repeat may be a remnant of an unequal cross-over event. A recently
proposed model (30) explains how short repeats or deletions of this type may
have originated. Within a Holliday-structure, mispairing may occur as the
result of neighbouring sequence homology, or because of hairpin structures
within a single DNA strand. An imperfect 14 bp stem-structure (AGAAACATCTTTGT
at 46) downstream of the 14 bp direct repeat may, by folding back, have been
the cause of the duplication. However, both investigated sequences are
approximately 25 bp out of phase compared to the 170 bp unit described by Wu
and Manuelidis (1). The question arises as to what the borders of the
amplification unit are within LI.26 and LI.84 related sequences. It may be a
sequence related to the consensus 340 bp a-1/a-2 dimer (1). This would explain
the remarkable coincidence of the cross-over event in LI.84 with the a-1/a-2
unit-boundaries. It
would, however, not explain the tetrameric organization of LI.84 (fig.2),
which suggests that LI.84 as a whole is an amplification unit. It would also
be inconsistent with data of LI.26, which is suggestive of an a-b-b type of
suborganization, and those of Waye and Willard (23) who noticed a similar 79
bp out of phase phenomenon in their BamHI-defined 2.0 kb multimer (fig.6A),
but clearly demonstrated their sequence to be the amplification unit. In a
tandem array of 170 bp sequences, the start-point of any unit is, of course,
arbitrary. A unit-definition based on restriction sites is thus inappropriate.
Given the chromosome specific nature of the discussed sequences, it is reason-
able to propose that different chromosomes carry distinct amplification-units.
The out of phase phenomenon may be explained by the existence of extra-chromo-
somal circular satellite DNA (24). Formation and integration of these circles
through random homologous recombination events can explain both circular per-
mutations and conservation of the 170 bp unit.
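The point that the start of a unit in a tandem array is arbitrary can be illustrated with a short Python sketch. The 16-base unit used here is a hypothetical toy sequence, not an alphoid monomer, and the helper names `rephase` and `is_circular_permutation` are introduced only for this example.

```python
def rephase(unit, offset):
    """Return the unit as it would read if the array were cut `offset`
    bases further along (a circular permutation of the unit)."""
    offset %= len(unit)
    return unit[offset:] + unit[:offset]


def is_circular_permutation(a, b):
    """True if b is a rotation of a, i.e. both could be cut from the same
    tandem array at different phases."""
    return len(a) == len(b) and b in a + a


unit = "GATTCTCAGAAACTCC"   # hypothetical toy repeat unit
array = unit * 3            # idealized tandem array
window = array[5:5 + len(unit)]  # unit-length window read 5 bases out of phase

print(window == rephase(unit, 5), is_circular_permutation(unit, window))
```

Any unit-length window cut from the idealized array is a rotation of the same unit, which is why the 25 bp and 79 bp phase shifts discussed above are compatible with conservation of the 170 bp periodicity.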
Although many sequences within LI.26 and LI.84 resemble restriction sites
in that they contain one or two base changes relative to the true recognition
site, most restriction sites occur with an n(170) bp spacing (fig.5). Assuming
random mutation, this suggests that considerable homogenization of sequences
is taking place continuously within the array, perhaps through gene conversion
or unequal crossing over, both meiotic and mitotic (31).
Irrespective of the nature of the basic unit of alphoid DNA amplification,
the 170 bp regularity remains conserved. The existence of more highly conser-
ved regions within each monomer (fig.6) may either be a cause of the regular-
ity, or, alternatively, a consequence of it. Two of the conserved regions we
observed, notably positions 103-111 and 140-148, coincide with the binding
sites II and III of African Green Monkey alpha protein (32). It has been
suggested that "nucleosome phasing" may play a role in conservation of the 170
bp structure in alphoid DNA (27), although other data conflict with this
opinion (33). However, the observation that the alphoid sequences
characterized to date demonstrate approximately 75% homology to each other,
irrespective of their chromosomal location, including sequences from the same
chromosome, suggests a restriction in the degree of divergence permitted, which
remains unexplained at present.
Although we have shown in this study that certain alphoid sequences have
evolved in ways resulting in chromosome specific attributes, it is clear that
other chromosome specific alphoid sequences should be analyzed in order to
establish a general model of alphoid DNA evolution.
ACKNOWLEDGEMENTS.
The authors would like to thank dr. A.M. Millington Ward and dr. G.-J.B.
van Ommen for helpful discussions and reviewing the manuscript and dr. F. Baas
and dr. H. van Ormondt for technical assistance during sequencing procedures
and computer analyses. This work was supported by the Netherlands Cancer
Foundation (Koningin Wilhelmina Fonds Grant nr. A83.21).
REFERENCES.
1. Wu, J.C. and Manuelidis, L. (1980) J. Mol. Biol. 142, 363-386.
2. Shimizu, Y., Yoshida, K., Ren, Ch., Fujinaga, K., Rajagopalan and Chinnadurai, G. (1983) Nature (Lond.) 302, 587-591.
3. Shafit-Zagardo, B., Maio, J.J. and Brown, F.L. (1982) Nucl. Acids Res. 10, 3175-3193.
4. Houck, C.M., Rinehart, F.P. and Schmid, C.W. (1979) J. Mol. Biol. 132, 289-306.
5. Maio, J.J., Brown, F.L. and Musich, P.R. (1981) Chromosoma (Berl.) 83, 103-125.
6. Manuelidis, L. and Wu, J.C. (1978) Nature (Lond.) 276, 92-94.
7. Manuelidis, L. (1978) Chromosoma (Berl.) 66, 1-21.
8. Darling, S.M., Crampton, J.M. and Williamson, R. (1982) J. Mol. Biol. 154, 51-63.
9. Manuelidis, L. (1978) Chromosoma (Berl.) 66, 23-32.
10. Willard, H.F., Smith, K.D. and Sutherland, J. (1983) Nucl. Acids Res. 11, 2017.
11. Wolfe, J., Darling, S.M., Erickson, R.P., Craig, I.W., Buckle, V.J., Rigby, P.W.J., Willard, H.F. and Goodfellow, P.N. (1985) J. Mol. Biol. 182, 477-485.
12. Devilee, P., Cremer, T., Slagboom, P., Bakker, E., Scholl, H.P., Hager, H.D., Stevenson, A.F.G., Cornelisse, C.J. and Pearson, P.L. Cytogen. Cell Gen., in press.
13. Hofker, M.H., Wapenaar, M.C., Goor, N., Bakker, B., Van Ommen, G.J.B. and Pearson, P.L. (1985) Hum. Genet. 70, 148-156.
14. Pearson, P.L., Bakker, E. and Flavell, R.A. (1982) Cytogen. Cell Gen. 32, 308.
15. Maniatis, T., Fritsch, E.F. and Sambrook, J. (1982) Molecular Cloning, a laboratory manual. Cold Spring Harbor Laboratory.
16. Herbschleb-Voogt, E., Grzeschik, K.-H., Pearson, P.L. and Meera Khan, P. (1981) Hum. Genet. 59, 317-323.
17. Southern, E.M. (1975) J. Mol. Biol. 98, 503-517.
18. Jeffreys, A.J. and Flavell, R.A. (1977) Cell 12, 429-439.
19. Rigby, P.W.J. et al (1977) J. Mol. Biol. 113, 237-251.
20. Sanger, F., Nicklen, S. and Coulson, A.R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
21. Messing, J. (1983) Meth. Enzymol. 101, 20-78.
22. Biggin, M.D., Gibson, T.J. and Hong, G.F. (1983) Proc. Natl. Acad. Sci. USA 80, 3963-3965.
23. Waye, J.S. and Willard, H.F. (1985) Nucl. Acids Res. 13, 2731-2743.
24. Jones, R.S. and Potter, S.S. (1985) Nucl. Acids Res. 13, 1027-1042.
25. Manuelidis, L. (1982) In: Genome Evolution, eds. G.A. Dover and R.B. Flavell, Academic Press, New York, p. 263-285.
26. Lee, T.N.H. and Singer, M.F. (1982) J. Mol. Biol. 161, 323-342.
27. Wu, K.C., Strauss, F. and Varshavsky, A. (1983) J. Mol. Biol. 170, 93-117.
28. Brutlag, D.L. (1980) Ann. Rev. Genet. 14, 121-144.
29. Orgel, L.E., Crick, F.H.C. and Sapienza, C. (1980) Nature (Lond.) 288, 645-646.
30. Millington Ward, A.M., Reuser, J.A.M., Scheele, J.Y., Van Lohuizen, E.J., Van Gorkum Van Diepen, I.R.M.C., Klasen, E.A. and Bresser, M. (1984) Mol. Gen. Genet. 193, 332-339.
31. Dover, G. (1982) Nature (Lond.) 299, 111-117.
32. Strauss, F. and Varshavsky, A. (1984) Cell 37, 889-901.
33. Smith, M.R. and Lieberman, M.H. (1984) Nucl. Acids Res. 12, 6493.