Autosomal STR Variation in Five Austronesian Populations


E. M. SHEPARD,1 R. A. CHOW,1 EPIFANIA SUAFO’A,2 DAVID ADDISON,3 A. M. PÉREZ-MIRANDA,1 R. L. GARCIA-BERTRAND,4 AND R. J. HERRERA1

Abstract Human population characteristics at the genetic level are integral to both forensic biology and population genetics. This study evaluates biparental microsatellite markers in five Austronesian-speaking groups to characterize their intra- and interpopulation differences. Genetic diversity was analyzed using 15 short tandem repeat (STR) loci from 338 unrelated individuals from 5 Pacific islands populations, including the aboriginal Ami and Atayal groups from Taiwan, Bali and Java in Indonesia, and the Polynesian islands of Samoa. Allele frequencies from the STR profiles were determined and compared to other geographically targeted worldwide populations procured from recent literature. Hierarchical AMOVA analysis revealed a large number of loci that exhibit significant correspondence to linguistic partitioning among groups of populations. A pronounced divide exists between Samoa and the East (Formosa) and Southeast Asian (Bali and Java) islands. This is clearly illustrated in the topology of the neighbor-joining tree. Phylogenetic analyses also indicate clear distinctions between the Ami and Atayal and between Java and Bali, which belie the respective geographic proximities of the populations in each set. This differentiation is supported by the higher interpopulation variance components of the Austronesian populations compared to other Asian non-Austronesian groups. Our phylogenetic data indicate that, despite their linguistic commonalities, these five groups are genetically distinct. This degree of genetic differentiation justifies the creation of population-specific databases for human identification.

1 Department of Biological Sciences, Florida International University, University Park, OE 304, Miami, FL 33199.
2 National Park Service, Pago Pago, American Samoa, 96799.
3 American Samoa Power Authority, Pago Pago, American Samoa, 96799.
4 Department of Biological Sciences, Colorado College, Colorado Springs, CO 80903.

Human Biology, December 2005, v. 77, no. 6, pp. 825–851. Copyright © 2005 Wayne State University Press, Detroit, Michigan 48201-1309

KEY WORDS: ISLAND SOUTHEAST ASIA, TAIWAN ABORIGINES, POLYNESIA, TAIWAN, BALI, JAVA, SAMOA, AUSTRONESIAN-SPEAKING GROUPS, AMI GROUP, ATAYAL GROUP, SHORT TANDEM REPEATS, MICROSATELLITE MARKERS, D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, VWA, TPOX, D18S51, D5S818, FGA, D2S1338, D19S433, PHYLOGENY, FORENSIC BIOLOGY, GENETIC DIVERSITY.

Genetic diversity, characterized at both the intra- and the interpopulation level, forms the basis of forensic biology and population genetics, respectively. Yet these two disciplines are not always fully integrated in studies using human samples. A well-established marker system such as autosomal short tandem repeats (STRs) represents hypervariable regions that can provide the fine resolution needed to determine relationships among closely related populations in recent evolutionary history (Bowcock et al. 1994; Jorde et al. 1995, 1997; Bosch et al. 2000; Lum et al. 2002; Rowold and Herrera 2003; Perez-Miranda et al. 2005; Shepard and Herrera 2005) and the discrimination power essential for robust individual probabilities of inclusion (Leibelt et al. 2003; Collins et al. 2004). Also, STRs are used in these studies because of their abundance and relatively even distribution throughout the genome, high levels of polymorphism, large number of possible alleles per locus, and short amplicon lengths, which facilitate DNA amplification, separation, and detection (Butler 2001; Butler et al. 2003).

The populations composing the Austronesian language family have been the subject of numerous studies from overlapping anthropological disciplines, namely, linguistics, archeology, and molecular biology. Studies in these fields have provided evidence on the complexity of human migration patterns during the Austronesian diaspora (Bellwood 2001; Underhill 2004). The current range of Austronesian-speaking people extends from Taiwan (Formosa) to the north, Easter Island (west of Chile, South America) in the east, New Zealand to the south, and as far as Madagascar (off East Africa) to the west. The distances between these locations cover approximately two-thirds the circumference of the planet. Consequently, because of their wide geographic distribution, the Austronesians are an interesting group from the perspective of population genetics. In addition, because of their relatively recent expansions into the Indian and Pacific Oceans, beginning about 6,000 years b.p. and ending as late as 800 years b.p., Austronesian populations provide an ideal test group to study a major dispersal process from prehistoric time. In terms of forensics, this area is undercharacterized by accepted STR marker sets.

The goal of this study is to investigate the allelic profiles of 15 biparental STR loci common to forensic studies (D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, VWA, TPOX, D18S51, D5S818, FGA, D2S1338, and D19S433) in five geographically targeted Austronesian populations from the Pacific Ocean. These include two aboriginal Taiwanese populations (the Ami and Atayal), two Indonesian populations from Bali and Java, and a Polynesian population from Samoa. The ultimate aim of our study is to assess the degree of genetic heterogeneity among these five Austronesian populations and to ascertain how they relate phylogenetically to regional and worldwide groups previously studied with the same set of markers. In addition, these well-characterized databases will be of forensic value.

Upon examination of these 15 highly polymorphic loci, we find that when groups of populations are compared, the overall tests of correlation of genetic partitioning with linguistic and geographic differences are statistically significant. Also, most loci exhibit significant correlation at the level of groups of populations along geographic and linguistic lines. Phylogenetic analyses display some thought-provoking results, including an extreme differentiation between the two aboriginal populations from Taiwan, the Ami and Atayal, despite their overlapping geographic range. Similarly, we detect a clear distinction between the Indonesian populations of Bali and Java, separated by mere miles within the Indo-Malaysian archipelago. Of particular interest is the segregation of Samoa from the other four Austronesian groups into a different clade altogether. Based on this evidence, we conclude that these five populations do not share an obvious genetic link, despite their common language affiliation, the implications of which are discussed further within a framework of autosomal STR analysis.

Materials and Methods

Populations, Sample Collection, and DNA Isolation. The five Austronesian populations from the Pacific Ocean investigated in this study include two aboriginal groups from Formosa (Taiwan) (the Ami and Atayal), two populations from islands of the Indonesian chain (Bali and Java), and a fifth population from the Samoan islands in Polynesia (Figure 1). The Samoan samples were collected from both Western and American Samoa as a representative group. Data from 12 worldwide populations (Table 1) were obtained from the literature and used for comparison. Populations were chosen from the literature to be representative of different ethnic groups and biogeographic areas. Individuals were identified by biogeographic information traced back at least two generations. Each collection was arranged through the leaders of each region and supervised by the same. Sample collections were performed according to the ethical guidelines outlined by the Institutional Review Board of Florida International University. All samples were collected as whole blood in Vacutainer tubes containing EDTA. DNA was extracted using the standard phenol-chloroform method (Antunez de Mayolo et al. 2002).

PCR Amplification and Detection of STRs. The samples were amplified by PCR using the commercial AmpFlSTR Identifiler kit (Applied Biosystems, Foster City, California) at the following loci: D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, D2S1338, D19S433, VWA, TPOX, D18S51, D5S818, FGA, and amelogenin. Amplifications were performed in a GeneAmp PCR System 9600 thermocycler (Applied Biosystems) using the following cycling parameters: 11 min denaturation at 95°C; 28 cycles of 1 min denaturation at 94°C, 1 min primer annealing at 59°C, and 1 min primer extension at 72°C; and a final soak for 60 min at 60°C. A portion of each amplified sample was mixed with formamide and GS500 LIZ as an internal size standard, as recommended by the manufacturer (Applied Biosystems), and then separated using an ABI Prism 3100 Genetic Analyzer (Applied Biosystems). GeneScan 3.7 was used to determine the fragment sizes, and Genotyper 3.7 NT software was used to designate alleles by comparison with the allelic ladder provided by the manufacturer.
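As a quick check on the cycling program described above, the hold times can be summed in a few lines. This is only an illustrative sketch: the variable and function names are hypothetical, and instrument ramp times between temperatures are ignored.

```python
# Hold times in minutes, taken from the cycling parameters in the text.
INITIAL_DENATURATION_MIN = 11      # 95 C
CYCLES = 28
STEP_MINUTES_PER_CYCLE = {
    "denaturation_94C": 1,
    "annealing_59C": 1,
    "extension_72C": 1,
}
FINAL_SOAK_MIN = 60                # 60 C


def total_hold_minutes():
    """Sum of programmed hold times; ramping between steps is not included."""
    per_cycle = sum(STEP_MINUTES_PER_CYCLE.values())
    return INITIAL_DENATURATION_MIN + CYCLES * per_cycle + FINAL_SOAK_MIN


print(total_hold_minutes())  # 11 + 28 * 3 + 60 = 155
```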


Figure 1. Locations of the populations used in this study. Language affiliation classifications were obtained from http://www.ethnologue.com. Geographic coordinates for each population were generated according to the geopolitical Mercator projection (Watkins et al. 2003).

Statistical and Phylogenetic Analyses. Allele frequencies of the 15 STR loci were calculated using the gene counting method (Li 1976). The Arlequin software package, version 2.000 (Levene 1949; Guo and Thompson 1992; Schneider et al. 2000), was used to assess Hardy-Weinberg equilibrium expectations using Fisher’s exact test with the modified Markov chain Monte Carlo method as well as to determine Nei’s gene diversity index (GD) (Nei 1987). Hardy-Weinberg equilibrium was evaluated at α = 0.05 and also using the Bonferroni adjustment for the number of loci tested (0.05/15 = 0.0033) as a correction for type I errors.
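The quantities named in this paragraph can be illustrated with a minimal sketch. It is not the Arlequin implementation: the genotypes are hypothetical, the function names are invented for illustration, and the gene diversity formula is the sample-size-corrected form commonly attributed to Nei (1987), which is assumed here rather than taken from the text.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Gene counting: every diploid genotype contributes two allele copies."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total_copies = 2 * len(genotypes)
    return {allele: n / total_copies for allele, n in counts.items()}


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def gene_diversity(freqs, n_individuals):
    """Assumed unbiased form: k / (k - 1) * (1 - sum p_i^2), k = 2n copies."""
    k = 2 * n_individuals
    return k / (k - 1) * (1 - sum(p * p for p in freqs.values()))


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni adjustment."""
    return alpha / n_tests


# Hypothetical genotypes at a single locus (allele labels as strings).
genotypes = [("9", "9"), ("9", "10"), ("10", "11"), ("9", "11"), ("10", "10")]
freqs = allele_frequencies(genotypes)          # 9: 0.4, 10: 0.4, 11: 0.2
ho = observed_heterozygosity(genotypes)        # 3 of 5 heterozygous: 0.6
gd = gene_diversity(freqs, len(genotypes))
threshold = bonferroni_alpha(0.05, 15)         # 0.05 / 15, about 0.0033
```

The last line reproduces the adjusted threshold quoted in the text for 15 loci.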

Table 1. Description of and Reference Information for the Studied Populations

Population          n    Description                                                   Reference
African Americans   258  General population of United States                           Butler et al. (2003)
Ami                 79   Aboriginal tribe, east-central Taiwan                         Present study
Angola              110  General population of Cabinda, Angola                         Beleza et al. (2004)
Atayal              25   Aboriginal tribe, north Taiwan                                Present study
Bali                79   General population of Bali                                    Present study
Belgium             222  General population, majority from Flanders region of Belgium  Decorte et al. (2004)
Japan               526  General population of Japan                                   Hashiyada et al. (2003)
Java                60   General population of Java                                    Present study
Malaysian Malay     210  Malay ethnicity from Malaysia                                 Seah et al. (2003)
Malaysian Chinese   219  Chinese ethnicity from Malaysia                               Seah et al. (2003)
Mozambique          142  General population of Maputo, Mozambique                      Alves et al. (2004)
North Poland        145  General population of northern Poland                         Szczerkowska et al. (2004)
Samoa               95   General population of American Samoa and Samoa                Present study
Taiwan              597  General population of Taiwan                                  Wang et al. (2003)
U.S. Caucasians     302  General population of United States                           Butler et al. (2003)
U.S. Hispanics      140  General population of United States                           Butler et al. (2003)
Venezuela           255  General population of Caracas, Venezuela                      Chiurillo et al. (2003)

Forensically useful parameters were also examined for all five populations studied, including power of discrimination (PD) and polymorphic information content (PIC), using the PowerStats program, version 1.2 (Tereba 1999; Jones 1972; Brenner and Morris 1990). To determine phylogenetic relationships, the 5 populations studied along with 12 other geographically targeted worldwide reference populations were included in a neighbor-joining tree using Phylip 3.52c software (Felsenstein 2002) based on FST distances (Reynolds et al. 1983). Bootstrap consensus scores (1,000 replications) were generated by the SeqBoot and GenDist options of the Phylip software, and the ConSense programs determined the best-fitting dendrogram.
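The two forensic parameters mentioned above can be sketched as follows. This is a hedged illustration rather than the PowerStats code: it assumes the usual definitions, PD as one minus the sum of squared observed genotype frequencies and PIC as 1 - Σp_i² - Σ_{i<j} 2p_i²p_j², and it uses hypothetical genotypes and function names.

```python
from collections import Counter
from itertools import combinations


def power_of_discrimination(genotypes):
    """PD = 1 - sum of squared genotype frequencies (assumed definition)."""
    n = len(genotypes)
    # Sort each pair so that ("9", "10") and ("10", "9") count as one genotype.
    counts = Counter(tuple(sorted(g)) for g in genotypes)
    return 1 - sum((c / n) ** 2 for c in counts.values())


def polymorphic_information_content(allele_freqs):
    """PIC = 1 - sum p_i^2 - sum over i<j of 2 p_i^2 p_j^2 (assumed definition)."""
    p = list(allele_freqs)
    homozygosity = sum(x * x for x in p)
    cross_term = sum(2 * (a * a) * (b * b) for a, b in combinations(p, 2))
    return 1 - homozygosity - cross_term


# Hypothetical single-locus sample: five individuals, five distinct genotypes.
genotypes = [("9", "9"), ("9", "10"), ("10", "11"), ("9", "11"), ("10", "10")]
pd_value = power_of_discrimination(genotypes)                     # 1 - 5 * 0.2**2
pic_value = polymorphic_information_content([0.4, 0.4, 0.2])
```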

Multidimensional scaling (MDS) analyses, also based on FST distances, were performed using the Statistical Package for the Social Sciences (SPSS) software program (SPSS Inc. 2001). G tests were carried out to determine differences in overall genetic variability between populations using Carmody’s program (Carmody 1990).
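For readers unfamiliar with the G test, the statistic itself is easy to compute from a table of allele counts. The sketch below is a generic log-likelihood-ratio calculation and is not Carmody's program, whose details are not given in the text; the counts are hypothetical, and a P value would still have to be obtained from a chi-square distribution or by permutation.

```python
import math


def g_statistic(table):
    """G = 2 * sum(O * ln(O / E)) for an r x c table of counts.

    Rows are populations and columns are alleles. Returns (G, degrees of freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # a zero cell contributes nothing to the sum
                expected = row_totals[i] * col_totals[j] / grand_total
                g += observed * math.log(observed / expected)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return 2 * g, df


# Hypothetical allele counts: identical proportions, then clearly different ones.
g_same, df_same = g_statistic([[10, 20, 30], [20, 40, 60]])
g_diff, df_diff = g_statistic([[30, 10], [10, 30]])
```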

Inter- and intrapopulation genetic variance component values (GST and Hs, respectively) were ascertained for nine Pacific Ocean populations, composed of six Austronesian (Ami, Atayal, Bali, Java, Malaysian Malay, and Samoa) and three non-Austronesian (Japan, Malaysian Chinese, and Taiwan) groups for each locus, according to the DISPAN software program (Ota 1993). Genetic structuring was analyzed for the same nine populations according to both biogeographic lines and linguistic subfamily affiliations through hierarchical analysis of molecular variance (AMOVA) (Excoffier et al. 1992). Linguistic correlations were assessed based on the following subfamily partitioning: two non-Austronesian groups [Japanese (Japan) and Sino-Chinese (Taiwan and Malaysian Chinese)] and three Austronesian groups [Formosan (Ami, Atayal), Western Malayo-Polynesian (Bali, Java, Malaysian Malay), and Eastern Malayo-Polynesian (Samoa)]. Geographic correlations were tested on the basis of the following regional groups: northeast Asia (Japan), East Asia (Ami, Atayal, Taiwan and Malaysian Chinese), Southeast Asia (Bali, Java, Malaysian Malay), and Polynesia (Samoa). Note that the Malaysian Chinese population, while inhabiting Malaysia, was grouped with the East Asian populations because of its Han Chinese ancestry.
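The relationship between the two variance components can be shown with a small sketch for a single locus. It assumes the textbook definitions (Hs as the mean within-population expected heterozygosity, HT as the heterozygosity of the pooled mean allele frequencies, and GST = (HT - Hs)/HT) without the sample-size corrections that a program such as DISPAN may apply; the frequencies and the function name are hypothetical.

```python
def gst_and_hs(pop_freqs):
    """pop_freqs: one dict per population mapping allele -> frequency (one locus).

    Returns (GST, Hs, HT) using uncorrected estimators.
    """
    n_pops = len(pop_freqs)
    alleles = set().union(*pop_freqs)
    # Mean expected heterozygosity within populations.
    hs = sum(1 - sum(p * p for p in f.values()) for f in pop_freqs) / n_pops
    # Heterozygosity expected from the unweighted mean allele frequencies.
    mean_freqs = {a: sum(f.get(a, 0.0) for f in pop_freqs) / n_pops for a in alleles}
    ht = 1 - sum(p * p for p in mean_freqs.values())
    return (ht - hs) / ht, hs, ht


# Two hypothetical populations with two alleles at one locus.
gst, hs, ht = gst_and_hs([{"A": 0.5, "B": 0.5}, {"A": 0.9, "B": 0.1}])
# Hs = (0.50 + 0.18) / 2 = 0.34, HT = 1 - (0.49 + 0.09) = 0.42
```

A higher GST with a lower Hs, the pattern reported for the Austronesian groups, means that more of the total diversity lies between populations than within them.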

Results

STR Diversity Within Populations. The allele-frequency distributions of the Ami, Atayal, Bali, Java, and Samoa populations are listed in Tables 2 through 6. In addition, parameters of importance to population genetics are summarized in Table 7 for each group under study. This table lists the loci in each population that do not meet Hardy-Weinberg equilibrium expectations at p < 0.05 (4 loci out of 75 possible tests). However, after applying the Bonferroni adjustment (see explanation in Materials and Methods section), this is reduced to a single departure from Hardy-Weinberg equilibrium (1 locus out of 75 possible tests).
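The screening described here can be reproduced for the three populations whose Hardy-Weinberg P values appear in Tables 2 through 4 (Ami, Atayal, and Bali). The sketch below is illustrative only: the P values are transcribed from those tables in their column order, the Java and Samoa values (Tables 5 and 6) are not included, and the function name is hypothetical.

```python
LOCI = ["D8S1179", "D21S11", "D7S820", "CSF1PO", "D3S1358", "TH01", "D13S317",
        "D16S539", "D2S1338", "D19S433", "VWA", "TPOX", "D18S51", "D5S818", "FGA"]

# "P value" rows of Tables 2, 3, and 4, in the locus order above.
P_VALUES = {
    "Ami": [0.33519, 0.73645, 0.14166, 0.92577, 0.09143, 0.47733, 0.21328, 0.56048,
            0.25976, 0.59075, 0.04676, 0.90535, 0.06765, 0.98115, 0.27838],
    "Atayal": [0.55754, 0.08959, 0.73341, 0.32034, 0.11418, 0.87490, 0.31314, 0.93405,
               0.66446, 0.47120, 0.90218, 0.41380, 0.63722, 0.23101, 0.26712],
    "Bali": [0.59438, 0.23265, 0.59467, 0.29176, 0.15917, 0.85325, 0.09204, 0.57035,
             0.95736, 0.30441, 0.00004, 0.69530, 0.76840, 0.10703, 0.56864],
}

ALPHA = 0.05
BONFERRONI_ALPHA = ALPHA / len(LOCI)  # 0.05 / 15


def departures(threshold):
    """(population, locus) pairs whose P value falls below the threshold."""
    return [(pop, locus)
            for pop, p_values in P_VALUES.items()
            for locus, p in zip(LOCI, p_values)
            if p < threshold]


print(departures(ALPHA))             # [('Ami', 'VWA'), ('Bali', 'VWA')]
print(departures(BONFERRONI_ALPHA))  # [('Bali', 'VWA')]
```

For scale, 75 independent tests at a 0.05 threshold would be expected to yield about 3.75 nominal departures by chance alone (75 × 0.05).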

Although a detailed dissection of allele frequencies is not the purpose of this study, some interesting observations warrant attention. For instance, in the Ami population (Table 2), allele 10 of TPOX, commonly encountered in other Asian databases, is notably absent. Conversely, in the Ami, allele 24.2 of FGA (0.0316) is present at a frequency up to 10 times higher than in the published Asian databases included in this study, and is absent altogether from the Atayal, Bali, Java, and Samoa. Observed heterozygosity (Ho) for the Ami ranges from 0.6076 in TH01 to 0.8987 in D13S317 and D2S1338. In the Atayal the smallest and largest allele sizes in each locus are often not present (Table 3). This population also lacks four common midsize alleles; most notably absent are alleles 9.3 and 10 of TH01 and alleles 15 and 24.2 of VWA. Observed heterozygosity for the Atayal ranges from 0.5200 in TH01 to 1.0000 in FGA. The Indonesian population from Bali (Table 4) contains a microvariant allele (allele 23.2 of D21S11) not encountered in any of the Asian databases used in our study. There was one deviation from Hardy-Weinberg equilibrium expectations for Bali in the VWA locus (p < 0.05) that persists even after application of the Bonferroni adjustment for type I errors. Departures at this locus have been reported in other regional populations, specifically the Malay group from Malaysia (Seah et al. 2003). Observed heterozygosity of the Bali population ranges from 0.6076 in VWA to 0.8608 in D2S1338 and D18S51. In the other Indonesian group from Java (Table 5), allele 13 of TPOX (0.0083) is present, although it is rarely detected in other Asian groups. In this Javanese population the observed heterozygosity ranges from 0.6333 in CSF1PO to 0.9500 in D21S11. Within the population from Samoa (Table 6), locus D13S317 contains a rare allele with 16 repeats, one of the largest in this locus reported thus far among the Asian data sets. Ho for Samoa ranges from 0.6105 in TPOX to 0.9158 in D16S539.


STR Diversity Among Populations. To investigate genetic affiliations among the five Pacific Ocean populations and their relationships to other worldwide populations, we generated a neighbor-joining tree using FST distances. Figure 2 displays a neighbor-joining phylogram based on all 17 populations. There are three major clusters in the dendrogram (bootstrap value of 50%). One consists of those of European/Hispanic descent; another primarily includes African groups or groups of African ancestry, and a third clade represents a cluster made up of Asians and Pacific Ocean populations. Overall, the topology of this tree is robust (only four nodes exhibit bootstrap values under 50% incidence).
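The pairwise distances that feed a neighbor-joining tree of this kind can be sketched as below. The formula is an assumption: it follows the form of the Reynolds, Weir, and Cockerham distance as it is usually given for PHYLIP's Gendist program, since the text does not state the expression itself. The allele frequencies and the function name are hypothetical.

```python
def reynolds_distance(pop1, pop2):
    """Assumed Reynolds-type distance between two populations.

    pop1, pop2: dict mapping locus -> dict mapping allele -> frequency.
    D = sum_loci sum_alleles (p1 - p2)^2 / (2 * sum_loci (1 - sum_alleles p1 * p2))
    """
    numerator = 0.0
    denominator = 0.0
    for locus in pop1:
        f1, f2 = pop1[locus], pop2[locus]
        alleles = set(f1) | set(f2)
        numerator += sum((f1.get(a, 0.0) - f2.get(a, 0.0)) ** 2 for a in alleles)
        denominator += 1 - sum(f1.get(a, 0.0) * f2.get(a, 0.0) for a in alleles)
    return numerator / (2 * denominator)


# Hypothetical frequencies at a single locus for two populations.
pop_a = {"TH01": {"7": 0.5, "9": 0.5}}
pop_b = {"TH01": {"7": 0.9, "9": 0.1}}
d_ab = reynolds_distance(pop_a, pop_b)   # 0.32 / (2 * 0.5) = 0.32
d_aa = reynolds_distance(pop_a, pop_a)   # identical populations give 0
```

A matrix of such distances over all 17 populations would then be passed to a tree-building routine; that step is not shown here.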

The Atayal and Ami segregate into the same group as the Japanese and Chinese from both Malaysia and Taiwan (91%) and then separate from these three populations with a confidence value of 29%. The Ami bifurcate from the Atayal with a bootstrap value of 40%. It is interesting to note that, although there is a genetic relationship between these two Taiwanese aboriginal groups, the extensive branch length of the Atayal, with respect to the Ami, is indicative of their genetic uniqueness. Both Bali and Java bifurcate from the Malaysian Malays (bootstrap value = 71%). However, the genetic distance between Bali and Java is not as pronounced as in the case with the Ami and Atayal. Samoa segregates in an isolated and intermediate position between the African and European/Hispanic groups with a bootstrap value of 100%.

MDS analysis was performed to examine the genetic relationships among the 17 populations based on FST distances (Figure 3). Its topology is consistent with that of the grouping in the neighbor-joining dendrogram. As in the neighbor-joining tree, there are three main clusters: (1) Europeans, (2) Africans and groups of African descent, and (3) Asians and Pacific Islanders. The Ami, Atayal, Bali, and Java populations all cluster with the Asian groups on the left side of the plot. The Samoan population lies on the y axis of the crosshairs closer to the African groups than to the European/Hispanic groups. The isolated placement of the Samoans segregating away from all clusters and the extreme outlier position of the Atayal indicate considerable genetic differentiation with respect to the other populations and mirror the long branch distance observed in the neighbor-joining tree.

The nine Asian and Pacific Ocean populations were split into two groups: Austronesian and non-Austronesian language affiliation. They were then analyzed to determine the allocation of genetic variance at the inter- (GST) and intrapopulation (Hs) levels (Table 8). The STR markers in the Austronesian groups displayed lower levels of intrapopulation variance than the non-Austronesian groups for 11 of the 15 loci and when calculated across all loci. GST values ranged from 2 to 10 times higher in the Austronesians than in the non-Austronesians per locus and were more than 8 times higher when assessed across all loci. When pairwise G tests were performed on our 5 populations and on the 12 geographically targeted worldwide groups taken from the literature, we observed significant genetic differentiation (p < 0.05) between all populations.


Table 2. STR Allele Frequencies for the Ami (Taiwanese Aborigine), n = 79

Column order of loci: D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, D2S1338, D19S433, VWA, TPOX, D18S51, D5S818, FGA

Allele  Frequencies (nonblank cells only, left to right across the loci above)
6       0.1076
7       0.2342  0.0506
8       0.1519  0.0570  0.3165  0.4747
9       0.0570  0.5253  0.0886  0.3101  0.1139  0.0443
9.3     0.0063
10      0.1076  0.2215  0.2722  0.0696  0.1582  0.0949  0.3544
11      0.0759  0.3608  0.2215  0.3101  0.2215  0.3924  0.2089
12      0.1519  0.1709  0.4114  0.1203  0.2215  0.0190  0.0443  0.2278
13      0.2595  0.0253  0.0823  0.0063  0.1266  0.2848  0.0253  0.1139
13.2    0.0190
14      0.2342  0.0127  0.0127  0.0063  0.0190  0.1519  0.1456  0.1772
14.2    0.1456
15      0.1139  0.3291  0.0063  0.0696  0.0570  0.2658
15.2    0.2975
16      0.0380  0.3291  0.0886  0.0063  0.1266  0.0823  0.0063
16.2    0.0253
17      0.0190  0.3291  0.0127  0.2532  0.1709
18      0.0063  0.0633  0.2342  0.0759
19      0.1582  0.1582  0.0886  0.1076
20      0.0443  0.0190  0.0253  0.0696
21      0.0063  0.0063  0.0316  0.1582
22      0.0823  0.0127  0.2405
23      0.2025  0.1709
24      0.2658  0.1013
24.2    0.0316
25      0.0759  0.0886
26      0.0253
28      0.0316
29      0.1392
30      0.1962
31      0.2215
31.2    0.0886
32      0.0380
32.2    0.1772
33      0.0127
33.2    0.0633
34.2    0.0316

Statistic  D8S1179  D21S11   D7S820   CSF1PO   D3S1358  TH01     D13S317  D16S539  D2S1338  D19S433  VWA      TPOX     D18S51   D5S818   FGA
Ho         0.8481   0.7848   0.7595   0.6835   0.6835   0.6076   0.8987   0.7848   0.8987   0.7722   0.7975   0.6582   0.8228   0.7722   0.8734
He         0.8279   0.8516   0.7705   0.7052   0.7077   0.6706   0.7611   0.7852   0.8421   0.7852   0.8204   0.6112   0.8493   0.7688   0.8571
P value    0.33519  0.73645  0.14166  0.92577  0.09143  0.47733  0.21328  0.56048  0.25976  0.59075  0.04676  0.90535  0.06765  0.98115  0.27838
GD         0.8279   0.8516   0.7705   0.7052   0.6793   0.6536   0.7611   0.7852   0.8421   0.7852   0.8204   0.8204   0.8493   0.7662   0.8571
PD         0.9354   0.9553   0.9002   0.8566   0.8060   0.8358   0.8678   0.9101   0.9351   0.9146   0.9319   0.7470   0.9438   0.9059   0.9508
PIC        0.7996   0.8279   0.7297   0.6477   0.6045   0.6060   0.7171   0.7464   0.8175   0.7474   0.7895   0.5278   0.8266   0.7249   0.8349

Ho, observed heterozygosity. He, expected heterozygosity. P value: Hardy-Weinberg equilibrium, Fisher’s exact test. GD, gene diversity index. PD, power of discrimination. PIC, polymorphic information content. Statistics calculated using PowerStats, v. 1.2 (Promega).
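The per-locus PD values in Table 2 can be combined into a rough multilocus figure. The sketch below multiplies the per-locus match probabilities (1 - PD), which assumes that the 15 loci are statistically independent; that assumption and the function name are illustrative and are not taken from the study.

```python
import math

# PD row of Table 2 (Ami), in the table's locus order.
AMI_PD = [0.9354, 0.9553, 0.9002, 0.8566, 0.8060, 0.8358, 0.8678, 0.9101,
          0.9351, 0.9146, 0.9319, 0.7470, 0.9438, 0.9059, 0.9508]


def combined_match_probability(pd_values):
    """Product of per-locus match probabilities (1 - PD), assuming independent loci."""
    return math.prod(1 - pd for pd in pd_values)


ami_match_probability = combined_match_probability(AMI_PD)
print(f"{ami_match_probability:.1e}")  # on the order of 1e-16
```

The combined power of discrimination is one minus this product, which is why a 15-locus profile is so much more informative than any single locus.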


Table 3. STR Allele Frequencies for the Atayal (Taiwan Aborigines), n = 25

Column order of loci: D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, D2S1338, D19S433, VWA, TPOX, D18S51, D5S818, FGA

Allele  Frequencies (nonblank cells only, left to right across the loci above)
6       0.2000
7       0.6400  0.0400
8       0.1000  0.0200  0.0600  0.3000
9       0.0200  0.1400  0.0400  0.5200  0.0200
10      0.3400  0.3600  0.2000  0.3600  0.1000  0.1000  0.3800
11      0.0800  0.3800  0.3400  0.4400  0.1600  0.5800  0.4200
12      0.1400  0.1200  0.3200  0.1000  0.1800  0.0400  0.1200
13      0.1000  0.0200  0.1200  0.0200  0.2400  0.0200  0.0400
13.2    0.1800
14      0.2400  0.0200  0.0200  0.1200  0.1600  0.4400
14.2    0.1000
15      0.0400  0.3400  0.1800  0.2400
15.2    0.1600
16      0.0600  0.4600  0.0800  0.0800  0.2000
16.2    0.0200
17      0.1600  0.2000  0.2200  0.0600
18      0.0400  0.0600  0.4200
19      0.1800  0.1200  0.0600
20      0.0600  0.1600
21      0.0800
22      0.0400  0.1800
23      0.1600  0.1400
24      0.2000  0.0800
25      0.0200  0.2400
26      0.0400
27      0.0200
28      0.0200
29      0.2400
30      0.2800
31      0.1400
31.2    0.2000
32      0.0400
32.2    0.0800

Statistic  D8S1179  D21S11   D7S820   CSF1PO   D3S1358  TH01     D13S317  D16S539  D2S1338  D19S433  VWA      TPOX     D18S51   D5S818   FGA
Ho         0.8000   0.7200   0.6800   0.6400   0.7600   0.5200   0.7200   0.7200   0.8000   0.8400   0.7200   0.6800   0.6400   0.8800   1.0000
He         0.8041   0.8343   0.7608   0.7869   0.6457   0.5412   0.6767   0.6743   0.8637   0.8441   0.7584   0.5747   0.7959   0.6751   0.8637
P value    0.55754  0.08959  0.73341  0.32034  0.11418  0.87490  0.31314  0.93405  0.66446  0.47120  0.90218  0.41380  0.63722  0.23101  0.26712
GD         0.8016   0.8122   0.7151   0.7420   0.6457   0.5412   0.6751   0.6743   0.8637   0.8441   0.7437   0.5747   0.7176   0.6751   0.8637
PD         0.9024   0.8992   0.8640   0.8672   0.6848   0.7296   0.7808   0.8448   0.9376   0.9184   0.8832   0.7008   0.8608   0.7200   0.9056
PIC        0.7569   0.7659   0.6492   0.6784   0.5648   0.4796   0.6015   0.6207   0.8278   0.8037   0.6896   0.4938   0.6575   0.5993   0.8283

Ho, observed heterozygosity. He, expected heterozygosity. P value: Hardy-Weinberg equilibrium, Fisher’s exact test. GD, gene diversity index. PD, power of discrimination. PIC, polymorphic information content. Statistics calculated using PowerStats, v. 1.2 (Promega).


Table 4. STR Allele Frequencies for Bali (Indonesia), n = 79

Column order of loci: D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, D2S1338, D19S433, VWA, TPOX, D18S51, D5S818, FGA

Allele frequencies (each row lists the nonempty cells for that allele, left to right in column order)
6      0.0570
7      0.3165
8      0.2278  0.1392  0.3101  0.0063  0.6456
9      0.0633  0.0506  0.2911  0.1329  0.0949  0.1329  0.0063
9.3    0.1013
10     0.1582  0.1835  0.2405  0.0949  0.1646  0.1266  0.0063  0.3671
11     0.0886  0.3418  0.2848  0.2342  0.3418  0.2089  0.2342
12     0.1582  0.1456  0.3797  0.1456  0.2468  0.0823  0.0063  0.0443  0.3228
12.2   0.0127
13     0.1772  0.0380  0.0443  0.0127  0.1582  0.2025  0.1013  0.0696
13.2   0.0823
14     0.1772  0.0190  0.0253  0.2342  0.2215  0.1962
14.2   0.0570
15     0.1392  0.3734  0.1582  0.0063  0.2975
15.2   0.1519
16     0.0823  0.2911  0.1456  0.2278
16.2   0.0190
17     0.0190  0.2722  0.0886  0.3228  0.0443
18     0.0380  0.0633  0.1456  0.0127  0.0063
19     0.0063  0.2848  0.0633  0.0190  0.0633
20     0.1835  0.0696  0.0380  0.1392
21     0.0253  0.0253  0.0127  0.0380
21.2   0.0506
22     0.1203  0.0063  0.2342
22.2   0.0253
23     0.1392  0.2089
23.2   0.0063  0.0127
24     0.0570  0.1139
25     0.0380  0.0443
25.2   0.0063
26     0.0316
27     0.0190
27.2   0.0063
28.2   0.0443
29.2   0.2595
30.2   0.2152
31     0.0063
31.2   0.0759
32     0.0316
32.2   0.0633
33     0.1835
33.2   0.0253
34     0.0823
35     0.0063

Summary statistics (one value per locus, in column order)
Ho       0.8481  0.8228  0.7215  0.6709  0.6329  0.7595  0.7215  0.7215  0.8608  0.7975  0.6076  0.5823  0.8608  0.7089  0.8354
He       0.8592  0.8450  0.8104  0.7452  0.7045  0.7781  0.7951  0.7730  0.8395  0.8478  0.8357  0.5252  0.8099  0.7519  0.8624
P value  0.59438  0.23265  0.59467  0.29176  0.15917  0.85325  0.09204  0.57035  0.95736  0.30441  0.00004  0.69530  0.76840  0.10703  0.56864
GD       0.8582  0.8387  0.7713  0.7154  0.7043  0.7781  0.7879  0.7730  0.8395  0.8441  0.7999  0.5252  0.8099  0.7058  0.8624
PD       0.9527  0.9415  0.9095  0.8627  0.8537  0.9133  0.9104  0.9088  0.9486  0.9463  0.9213  0.7162  0.9236  0.8342  0.9572
PIC      0.8348  0.8135  0.7312  0.6586  0.6419  0.7395  0.7495  0.7338  0.8150  0.8187  0.7669  0.4692  0.7788  0.6442  0.8419

Ho, observed heterozygosity. He, expected heterozygosity. P value: Hardy-Weinberg equilibrium, Fisher’s exact test. GD, gene diversity index. PD, power of discrimination. PIC, polymorphic information content. Statistics calculated using PowerStats, v. 1.2 (Promega).
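The per-locus PD values can be combined into a single figure for the whole 15-locus panel. The sketch below assumes that the loci are independent and that the combined power of discrimination is one minus the product of the per-locus match probabilities (1 - PD); the function name is hypothetical, and the PD values are transcribed from the PD row of Table 4. The match probability is reported directly because the combined PD is too close to 1 to be represented accurately in double precision.

```python
import math

# Per-locus PD values for Bali, transcribed from the PD row of Table 4
bali_pd = [0.9527, 0.9415, 0.9095, 0.8627, 0.8537, 0.9133, 0.9104, 0.9088,
           0.9486, 0.9463, 0.9213, 0.7162, 0.9236, 0.8342, 0.9572]

def combined_match_probability(pd_values):
    """Probability that two random individuals match at every locus,
    assuming the loci are statistically independent."""
    return math.prod(1.0 - pd for pd in pd_values)

match_prob = combined_match_probability(bali_pd)
print(f"combined match probability: {match_prob:.2e}")
print(f"-log10(match probability):  {-math.log10(match_prob):.2f}")
```

A match probability on the order of 1e-16 corresponds to a combined PD with about 15 leading nines, which is in line with the value listed for Bali in Table 7.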


Table 5. STR Allele Frequencies for Java (Indonesia), n = 60

Column order of loci: D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, D2S1338, D19S433, VWA, TPOX, D18S51, D5S818, FGA

Allele frequencies (each row lists the nonempty cells for that allele, left to right in column order)
6      0.1250  0.0083
7      0.0083  0.2667  0.0083
8      0.2417  0.0083  0.1167  0.2917  0.5083
9      0.0500  0.0167  0.2833  0.1083  0.1417  0.1333  0.0083
9.3    0.0583
10     0.0250  0.1917  0.1750  0.1500  0.2167  0.1583  0.0583  0.4167
11     0.1083  0.3250  0.4000  0.2417  0.3167  0.2667  0.2000
12     0.1583  0.1583  0.3250  0.0917  0.2667  0.0583  0.0167  0.0417  0.2083
13     0.1250  0.0250  0.0750  0.0250  0.0333  0.1167  0.2833  0.0083  0.0083  0.1083  0.1417
13.2   0.0583
14     0.2417  0.0583  0.0167  0.1917  0.2917  0.2500  0.0083
14.2   0.0417
15     0.1833  0.2750  0.1250  0.0417  0.2333
15.2   0.2250
16     0.1333  0.3583  0.0333  0.0083  0.1083  0.1833  0.0083
16.2   0.0083
17     0.0250  0.2167  0.1000  0.2167  0.0833
18     0.0667  0.0583  0.1833  0.0083  0.0250
19     0.2333  0.1250  0.0333  0.0583
20     0.1000  0.0167  0.0250  0.0750
21     0.0167  0.0083  0.0083  0.1667
21.2   0.0250
22     0.0417  0.0250  0.2417
22.2   0.0417
23     0.2333  0.1583
23.2   0.0083
24     0.1250  0.0750
25     0.0417  0.0250
25.2   0.0333
26     0.0167  0.0333
26.2   0.0083
27     0.0083  0.0167
28     0.0667  0.0083
29     0.1917
30     0.2583
30.2   0.0417
31     0.1333
31.2   0.0917
32     0.0500
32.2   0.0833
33     0.0167
33.2   0.0167
34     0.0250
34.2   0.0167

Summary statistics (one value per locus, in column order)
Ho       0.8333  0.9500  0.8167  0.6333  0.6833  0.8167  0.6833  0.6667  0.9000  0.7000  0.6500  0.6833  0.8333  0.7000  0.8000
He       0.8462  0.8573  0.7814  0.8024  0.7695  0.8020  0.8116  0.7763  0.8541  0.8149  0.8619  0.6564  0.8541  0.7468  0.8707
P value  0.81497  0.26635  0.89755  0.30447  0.48089  0.23996  0.17922  0.17478  0.4728  0.30142  0.06316  0.26398  0.91123  0.50657  0.10098
GD       0.8462  0.8573  0.7814  0.7106  0.7468  0.8001  0.7947  0.7763  0.8541  0.8148  0.8120  0.6396  0.8307  0.7287  0.8703
PD       0.9417  0.9367  0.9056  0.8639  0.8889  0.9139  0.9211  0.9061  0.9461  0.9350  0.9283  0.8039  0.9383  0.8761  0.9544
PIC      0.8190  0.8338  0.7395  0.6517  0.6977  0.7633  0.7562  0.7332  0.8299  0.7818  0.7782  0.5785  0.8011  0.6797  0.8497

Ho, observed heterozygosity. He, expected heterozygosity. P value: Hardy-Weinberg equilibrium, Fisher’s exact test. GD, gene diversity index. PD, power of discrimination. PIC, polymorphic information content. Statistics calculated using PowerStats, v. 1.2 (Promega).


Table 6. STR Allele Frequencies for Samoa, n = 95

Column order of loci: D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, D2S1338, D19S433, VWA, TPOX, D18S51, D5S818, FGA

Allele frequencies (each row lists the nonempty cells for that allele, left to right in column order)
6      0.1263
7      0.4684
8      0.1000  0.1368  0.0632  0.3730
9      0.1105  0.0737  0.1895  0.2000  0.2368
9.3    0.1526
10     0.2105  0.1947  0.1789  0.0421  0.0947  0.1684  0.0368  0.2211
11     0.0474  0.1947  0.3947  0.3368  0.2526  0.3526  0.0053  0.1053
11.2   0.0316
12     0.0158  0.2150  0.3474  0.2105  0.1579  0.0211  0.2737
13     0.2940  0.0842  0.0684  0.0526  0.1053  0.2421  0.0105  0.3105
13.2   0.1000
14     0.2316  0.0947  0.0105  0.0105  0.0263  0.1158  0.2053  0.2263  0.0895  0.0684
14.2   0.1579
15     0.1000  0.0053  0.3368  0.0158  0.0211  0.2158  0.2263  0.0211
15.2   0.2211
16     0.0737  0.3780  0.0105  0.0053  0.1000  0.1000
17     0.0211  0.2053  0.0789  0.2947  0.3010
18     0.0053  0.0526  0.1158  0.1158  0.0579
19     0.0158  0.2263  0.0421  0.1263  0.0632
20     0.0158  0.0053  0.0579  0.0105
21     0.0842  0.0053  0.0158
22     0.2526  0.0053  0.0368
23     0.1211  0.2632
24     0.0789  0.3368
25     0.0211  0.0158  0.1316
26     0.1053
27     0.0053  0.0263
28     0.2579  0.0053
29     0.2632  0.0053
30     0.1368
31     0.0842
31.2   0.0632
32     0.0053
32.2   0.1526
33.2   0.0158
34.2   0.0158

Summary statistics (one value per locus, in column order)
Ho       0.8000  0.7684  0.8526  0.7368  0.6842  0.6842  0.7579  0.9158  0.8211  0.7474  0.8632  0.6105  0.8211  0.7474  0.8211
He       0.8224  0.8539  0.8437  0.6903  0.7014  0.8017  0.7937  0.8248  0.8411  0.8179  0.7987  0.6822  0.8226  0.7793  0.7991
P value  0.90599  0.26757  0.53662  0.28805  0.52546  0.32775  0.79872  0.28200  0.57929  0.01297  0.60521  0.23593  0.66662  0.37434  0.01979
GD       0.8010  0.8148  0.8437  0.6903  0.7014  0.7192  0.7937  0.8248  0.8410  0.8179  0.7943  0.6821  0.8226  0.7709  0.7866
PD       0.9272  0.9334  0.9458  0.8319  0.8552  0.8829  0.9288  0.9259  0.9454  0.9303  0.9130  0.8397  0.9372  0.8999  0.8893
PIC      0.7680  0.7849  0.8189  0.6284  0.6415  0.6837  0.7615  0.7928  0.8171  0.7871  0.7587  0.6133  0.7956  0.7252  0.7532

Ho, observed heterozygosity. He, expected heterozygosity. P value: Hardy-Weinberg equilibrium, Fisher’s exact test. GD, gene diversity index. PD, power of discrimination. PIC, polymorphic information content. Statistics calculated using PowerStats, v. 1.2 (Promega).


Table 7. Statistical Population Genetic Parameters of Five Populations

Population  Total Alleles  Combined Power of Discrimination  Average Heterozygosity  Loci with Highest Power of Discrimination  Loci with Lowest Power of Discrimination  Departures from Hardy-Weinberg Equilibrium
Ami         111            0.999999999999999                 0.7710                  D21S11, D18S51, FGA                        TPOX                                      VWA
Atayal      89             0.999999999999663                 0.7269                  D2S1338                                    D3S1358, TPOX                             None
Bali        118            0.999999999999999                 0.7746                  D2S11, D2S1338, D19S433, FGA               TPOX                                      VWA^a
Java        129            0.999999999999999                 0.7915                  D8S1179, D2S1338, FGA                      TPOX                                      None
Samoa       117            0.999999999999999                 0.7799                  D7S820, D2S1338                            CSF1PO, TPOX                              D19S433, FGA

a. Persistence of departure from Hardy-Weinberg equilibrium after Bonferroni-like adjustment for number of loci tested (0.05/15 = 0.0033).
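The Bonferroni-like screen described in the footnote to Table 7 can be sketched as below. The P values are transcribed from the P value rows of Tables 4 and 6, on the assumption that the fifteen values in each row follow the column order of the loci; the function and variable names are hypothetical.

```python
LOCI = ["D8S1179", "D21S11", "D7S820", "CSF1PO", "D3S1358", "TH01", "D13S317",
        "D16S539", "D2S1338", "D19S433", "VWA", "TPOX", "D18S51", "D5S818", "FGA"]

# Exact-test P values transcribed from Tables 4 (Bali) and 6 (Samoa)
P_VALUES = {
    "Bali": [0.59438, 0.23265, 0.59467, 0.29176, 0.15917, 0.85325, 0.09204,
             0.57035, 0.95736, 0.30441, 0.00004, 0.69530, 0.76840, 0.10703,
             0.56864],
    "Samoa": [0.90599, 0.26757, 0.53662, 0.28805, 0.52546, 0.32775, 0.79872,
              0.28200, 0.57929, 0.01297, 0.60521, 0.23593, 0.66662, 0.37434,
              0.01979],
}

def hwe_departures(p_values, alpha=0.05):
    """Return loci departing from Hardy-Weinberg equilibrium at the nominal
    level and those that persist after dividing alpha by the number of loci."""
    threshold = alpha / len(p_values)          # 0.05 / 15 = 0.0033
    nominal = [l for l, p in zip(LOCI, p_values) if p < alpha]
    persistent = [l for l, p in zip(LOCI, p_values) if p < threshold]
    return nominal, persistent

for population, values in P_VALUES.items():
    nominal, persistent = hwe_departures(values)
    print(population, "nominal:", nominal, "after adjustment:", persistent)
```

Under these assumptions the VWA departure in Bali survives the adjusted threshold, whereas the two nominal departures in Samoa do not.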


Figure 2. Neighbor-joining phylogenic analyses of 17 worldwide populations based on FST distances from STR allele frequencies. The GenDist option of the Phylip software created branch distances onto which the corresponding bootstrap values (based on 1,000 replications) were transferred to the corresponding nodes of the neighbor-joining tree.
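The tree-building step named in the Figure 2 caption can be sketched in a few lines. This is a minimal illustration of the neighbor-joining algorithm, not the Phylip implementation: it omits bootstrapping, and the taxon labels and distance matrix are hypothetical values rather than the FST distances of this study.

```python
from itertools import combinations

def neighbor_joining(names, matrix):
    """Cluster taxa by neighbor joining.

    names: taxon labels; matrix: symmetric distances (list of lists).
    Returns a list of (child_a, child_b, parent, length_a, length_b) joins.
    """
    d = {a: {b: float(matrix[i][j]) for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    nodes = list(names)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        total = {a: sum(d[a][b] for b in nodes) for a in nodes}
        # Choose the pair that minimises the Q criterion
        a, b = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[p[0]][p[1]]
                   - total[p[0]] - total[p[1]])
        len_a = 0.5 * d[a][b] + (total[a] - total[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        parent = f"node{len(joins) + 1}"
        d[parent] = {parent: 0.0}
        # Distances from the new internal node to every remaining node
        for c in nodes:
            if c not in (a, b):
                d[parent][c] = d[c][parent] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [parent]
        joins.append((a, b, parent, len_a, len_b))
    a, b = nodes
    # The last two nodes are connected by a single edge of length d[a][b]
    joins.append((a, b, None, d[a][b], 0.0))
    return joins

# Hypothetical distances among five taxa
names = ["A", "B", "C", "D", "E"]
matrix = [[0, 5, 9, 9, 8],
          [5, 0, 10, 10, 9],
          [9, 10, 0, 8, 7],
          [9, 10, 8, 0, 3],
          [8, 9, 7, 3, 0]]
joins = neighbor_joining(names, matrix)
for join in joins:
    print(join)
```

Long terminal branch lengths in the output correspond to taxa that are strongly differentiated from all others, which is how branch length is read in the Discussion.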

Figure 3. Multidimensional scaling analyses of 17 worldwide populations based on FST distances from STR allele frequencies. AFA, African American; AMI, Ami; ATA, Atayal; BAL, Bali; BEL, Belgium; CAB, Cabinda, Angola; JAP, Japan; JAV, Java; MCH, Malaysian Chinese; MML, Malaysian Malay; MOZ, Mozambique; NPO, North Poland; SAM, Samoa; TAI, Taiwan; USC, US Caucasian; USH, US Hispanic; VEN, Venezuela.

Partitioning of Populations Based on Geography and Language. The distribution of genetic variance was assessed along geographic and linguistic partitioning among the Asian and Pacific Ocean populations using AMOVA. The six Austronesian populations, five from this study and one previously studied (Malaysian Malay), and three non-Austronesian reference populations from Asia (Japan, Taiwan general population, and Malaysian Chinese) were included in this analysis. Table 9 indicates the loci that exhibit statistically significant correlations and their corresponding variance values. Except for marker D19S433, all loci demonstrate no significant correlation (p > 0.05) between genetic diversity and linguistic or geographic partitions when populations within groups are compared. In contrast to the lack of significance among populations within groups, the following five loci overlapped, showing significant correlation (p < 0.05) between genetic diversity among groups of populations and both linguistics and geography: D8S1179, D7S820, TH01, D16S539, and D2S1338. Five additional loci (D21S11, VWA, TPOX, D18S51, and D5S818) exhibited significant correlation (p < 0.05) among groups of populations along linguistic lines, for a total of 10 significant loci out of 15. A single additional locus (CSF1PO) showed significant correlation (p < 0.05) between genetic differences and geographic partitioning among groups of populations, for a total of 6 of the 15 loci. The overall AMOVA among groups of populations along linguistic (p < 0.00001) and geographic (p < 0.01) lines generated significant correspondence to genetic structure. Yet the among-populations within-groups overall AMOVA gave nonsignificant correlation along linguistic and geographic lines.
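For reference, the hierarchical partition that underlies an AMOVA of this kind (Excoffier et al. 1992) can be written as follows. The symbols are the conventional ones and are an addition here: the subscripts a, b, and c denote the among-groups, among-populations-within-groups, and within-populations variance components, respectively.

```latex
\begin{align}
\sigma^2_T &= \sigma^2_a + \sigma^2_b + \sigma^2_c \\
\Phi_{CT} &= \frac{\sigma^2_a}{\sigma^2_T} \\
\Phi_{SC} &= \frac{\sigma^2_b}{\sigma^2_b + \sigma^2_c} \\
\Phi_{ST} &= \frac{\sigma^2_a + \sigma^2_b}{\sigma^2_T}
\end{align}
```

On this reading, a percentage of variance reported for the among-groups level would correspond to 100 times the ratio of the among-groups component to the total, and its significance is assessed by permuting populations among groups.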

Discussion

These results provide novel databases for 15 autosomal STR loci in four Austronesian populations (Ami, Atayal, Bali, and Samoa). For a fifth Austronesian group (Java) new data were added in the form of loci D2S1338 and D19S433 to a preexisting database (Othman et al. 2004). An inspection of the results reveals that there is a marked lack of commonly encountered microvariant alleles among the Ami, Atayal, and Samoan groups in the highly polymorphic loci D21S11 and FGA. We also detected specific alleles in common among the published Asian and Pacific Ocean population databases used in this study as well as in our Balinese and Javanese groups. These include alleles 28.2, 29.2, and 30.2 in the D21S11 locus and alleles 21.2, 22.2, 23.2, and 25.2 at the FGA locus. Overall, D2S1338 and FGA are the most discriminating loci across all populations, and TPOX is the least discriminating locus; D2S1338 and FGA have the highest number of alleles, and TPOX has the lowest number of alleles within each population. As expected in most cases, the most polymorphic loci (highest Ho value) are the most discriminating markers (highest PD value) for each population.

Table 8. Components of Genetic Variance for Nine Austronesian and Asian Non-Austronesian Populations

           Intrapopulation Hs                 Interpopulation GST
Locus      Austronesians  Non-Austronesians  Austronesians  Non-Austronesians
D8S1179    0.823938       0.841298           0.026409       0.001787
D21S11     0.830586       0.803820           0.023381       0.001264
D7S820     0.772001       0.765179           0.018428       0.002064
CSF1PO     0.706132       0.736505           0.010057       0.002924
D3S1358    0.700171       0.719401           0.011240       0.003285
TH01       0.704140       0.692647           0.064438       0.008190
D13S317    0.764493       0.798947           0.036658       0.001299
D16S539    0.765375       0.779884           0.037288       0.007956
D2S1338    0.845132       0.866158           0.026127       0.004573
D19S433    0.817751       0.797587           0.015838       0.007098
VWA        0.790375       0.794881           0.026988       0.002716
TPOX       0.602920       0.613072           0.051971       0.009099
D18S51     0.806176       0.862127           0.033837       0.001645
D5S818     0.731385       0.792780           0.034100       0.002064
FGA        0.845817       0.863177           0.033127       0.001464
All loci   0.767093       0.781831           0.029760       0.003691

Although four of the five populations under study (Ami, Atayal, Bali, and Java) clustered within the Asian/Pacific Ocean clade of the neighbor-joining tree (see Figure 2), the fact that the Polynesian group from Samoa segregates within the African groups is most notable. It is likely that genetic drift resulting from multiple bottleneck events, founder effects, and/or isolation has contributed to a genetic makeup for this Samoan population that does not reflect its ancestry. This rather unexpected result underscores the need to examine the genetic profiles of individual Pacific islands and not to consider them interchangeable for forensic analysis.

A second observation is the larger branch lengths of the Atayal and Samoan populations. These indicate a large degree of genetic differentiation in these two groups, possibly because of migratory bottlenecks, founder effects, many generations of relative isolation, and/or genetic drift. The relative positions of these two populations in the MDS analysis (see Figure 3), most notably that of the Atayal, corroborate this notion. As mentioned previously, it is interesting that the Atayal and Samoa are two of the three groups that lack microvariant alleles.

Table 9. Significant AMOVA Values for Nine Austronesian and Asian Non-Austronesian Populations^a

Linguistic Partitioning                   Geographic Partitioning
Among Groups     Among Populations        Among Groups     Among Populations
of Populations   Within Groups            of Populations   Within Groups
D8S1179 (0.37)   D19S433 (3.17)           D8S1179 (0.38)   D19S433 (2.88)
D21S11 (0.89)                             D7S820 (1.01)
D7S820 (1.21)                             CSF1PO (0.99)
TH01 (2.12)                               TH01 (2.61)
D16S539 (1.70)                            D16S539 (1.64)
D2S1338 (1.10)                            D2S1338 (0.98)
VWA (0.50)
TPOX (1.69)
D18S51 (1.05)
D5S818 (1.39)

a. Numbers in parentheses refer to percentage of variance, considered significant when p < 0.05.

Another important observation is that the three Indo-Malaysian island groups in this study (Bali, Java, and Malay Malaysia) segregate in a different branch of the clade distant from the two Taiwanese aboriginal groups, the Ami and Atayal. From these results it appears that the Austronesian language affiliation of these five groups is not reflected in their present genetic relationship. Analysis of variance components (see Table 8) revealed two related points: (1) The level of interpopulation differentiation among our Austronesian populations is higher than in the Asian non-Austronesian groups, and (2) conversely, the intrapopulation variance is lower in the Austronesians than in the Asian non-Austronesians for most loci. It would be expected that during geographic and cultural isolation, forces such as founder effects, inbreeding, and limited gene flow would reduce within-population variance while acting to augment interpopulation differences.
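The two variance components compared above can be illustrated with a small calculation. The sketch assumes Nei's definitions, in which Hs is the mean within-population gene diversity, HT is the gene diversity of the pooled (averaged) allele frequencies, and GST = (HT - Hs)/HT; populations are weighted equally, which may differ from the estimator used for Table 8, and the allele frequencies and function name are hypothetical.

```python
def gene_diversity_partition(freqs_by_population):
    """Return (Hs, HT, GST) for one locus.

    freqs_by_population: one allele-frequency list per population, all
    listing the same alleles in the same order (hypothetical input).
    """
    k = len(freqs_by_population)
    n_alleles = len(freqs_by_population[0])
    # Mean within-population gene diversity
    hs = sum(1.0 - sum(p * p for p in pop) for pop in freqs_by_population) / k
    # Gene diversity of the unweighted mean allele frequencies
    mean = [sum(pop[i] for pop in freqs_by_population) / k
            for i in range(n_alleles)]
    ht = 1.0 - sum(p * p for p in mean)
    return hs, ht, (ht - hs) / ht

hs, ht, gst = gene_diversity_partition([[0.6, 0.4], [0.4, 0.6]])
print(f"Hs={hs:.4f}  HT={ht:.4f}  GST={gst:.4f}")
```

Read in reverse, the same relation gives HT = Hs/(1 - GST); for the "All loci" row of Table 8 this is roughly 0.767093/(1 - 0.029760), or about 0.791, for the Austronesian populations.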

An examination of the AMOVA results (see Table 9) for nine Austronesian and Asian non-Austronesian populations indicates that most of the loci exhibit significant genetic partitioning along linguistic and geographic lines among groups of populations. On the other hand, only one locus, D19S433, showed significant correlations with both language and geography among populations within groups. It is likely that high levels of polymorphism and differences in allele frequencies particular to this locus among the nine populations provide the fine resolution necessary to detect genetic variability at the among-populations within-groups level.

Similarly, overall statistically significant correlation along linguistic and geographic partitioning was detected only among groups of populations. It is expected that a greater number of loci will generate significant correlations with linguistic and geographic partitioning among groups of populations rather than among populations within groups because genetic differences are generally greater at the among-populations level. It is worth mentioning that a greater number of loci display significant correspondence between genetics and linguistics compared to genetic and geographic partitioning at the among groups of populations level. These results indicate that division based on language groups is in better agreement with the genetic structure of these nine Austronesian and Asian non-Austronesian populations.

Focusing on a smaller geographic scale, the aboriginal populations of Formosa provide a unique opportunity to dissect the effects of social and cultural relationships on the genetic makeup of neighboring populations. The Ami and Atayal groups represent two of the nine extant indigenous tribes of Taiwan. The Ami inhabit the narrow eastern seacoast plains of the island and represent the largest tribal group, approximately 130,000 in number. The Atayal are the second largest tribe (about 90,000) and reside adjacent to the Ami in the mountainous terrain of northern Taiwan. Historical accounts cite continuous waves of migrants from the Asian mainland who displaced the indigenous tribes and forced them into the less accessible areas of the island, hence leading to their current distribution (Knapp 1980).

A number of previous studies using mitochondrial DNA (Lum et al. 1994; Redd et al. 1995; Melton et al. 1995, 1998) found that the Ami and Atayal aboriginal groups have a large amount of mtDNA sequence homology, suggesting a common ancestral source in central or southern China. However, the investigators also reported evidence of a prolonged isolation from mainland Chinese and other Asian influences in the recent past. Studies based on autosomal (Sewerin et al. 2002; Chow et al. 2005) and mtDNA (Horai et al. 1995) markers have also demonstrated genetic uniqueness among the indigenous groups, implying varying temporal and/or spatial sources for the initial colonization of Formosa. One paternal lineage study found that the Ami stand out from the other aboriginal groups because of their closer genetic association with both South China and the Philippines. This is illustrated by the Ami’s high frequency of haplogroup L in contrast to the other four aboriginal groups, which lack this haplogroup altogether.

The distinctness of aboriginal groups from each other is especially evident in the extremely homogeneous Atayal, whose Y chromosomes are almost entirely of a single haplogroup (haplogroup H) (Capelli et al. 2001). This theme is echoed in older studies using classical markers (Cavalli-Sforza et al. 1994). In the present study both the Ami and the Atayal separate from the other populations from Asia and the Pacific Ocean in both the dendrogram and the MDS plot. In turn, the extreme branch distance of the Ami and, in particular, the Atayal is consistent with the intertribal group differentiation described in the literature. Although the ranges of these two tribes share a border on the northeast of the island, it is likely that regional geographic, cultural, and linguistic barriers played a role in the genetic differentiation of these two Taiwanese aboriginal groups. These findings support previous genetic studies with regard to differences between tribes.


Similar to the Taiwanese tribal groups, the populations of the islands of Bali and Java lie in close proximity within the Indo-Malaysian archipelago, separated by mere miles. In a recent study of the general Javanese population (Othman et al. 2004) that reported a battery of autosomal STRs that coincide with 13 of the loci reported in the present work, we found no marked differences in allele frequencies compared to our data. Both Bali and Java belong to the same Western Malayo-Polynesian subgroup of the Austronesian language family and, not surprisingly, segregate together within our neighbor-joining tree and MDS analysis. Also, as could be anticipated, the Malaysian Malay population clusters together in the same subclade with the Bali and the Java groups. Contrary to expectations based on geography alone, however, Bali and Java are clearly more distinct from each other than even the population of Han Chinese is from the Japanese. In the MDS analysis the Javanese group plots closer to the Han Chinese from Malaysia and Taiwan than to its nearest neighbors from Bali. This may be a reflection of admixture of the Javanese population with an influx of Muslim, Indian, and mainland Chinese migrants in the recent past. In historical times Java has been subject to waves of Buddhist, Hindu, and Muslim migrations, generating cultural diversity and possibly genetic heterogeneity. In addition, Java is more than 23 times larger in area than Bali, and it is possible that this sheer difference in area allows a larger effective population size. On the other hand, Bali has remained relatively isolated both culturally and genetically.

Samoa, the most distant population from the Asian mainland examined in the present study, segregates as an outlier in both the dendrogram and MDS analyses. These islands lie near the boundary dividing Polynesia from both Melanesia to the west and Micronesia to the northwest and therefore represent a population at the crossroads of the Pacific Ocean. Our phylogenetic analyses of this Polynesian group indicate no apparent genetic relationship with the other Austronesian populations. In fact, the Samoans segregate at an intermediate position within the African cluster in the neighbor-joining tree and plot closer to the African groups in the MDS analysis. The relatively long branch length in the phylogram and the isolation in the MDS analysis are indicative of the genetic uniqueness of this particular group compared to the other populations examined in this study.

In a previous work using five Y-chromosome STRs, similar results were noted in a Western Samoan population and were attributed to unique allele distributions compared to other Asian and Pacific Ocean groups (Parra et al. 1999). Parra and colleagues (1999) suggested founder effects stemming from the initial Austronesian settlement of Oceania as a likely cause. It is possible that the pointed genetic distinctiveness of the Samoans that provides for their segregation within the African cluster is the result of extreme genetic drift.

It is well documented that early Polynesians had reached as far as Samoa by means of the northern islands of Tonga by about 3,000 years b.p. Here, the Polynesian language evolved into two different subgroups: the Tongic and Nuclear Polynesian subgroups. The Nuclear Polynesian subgroup contains the Eastern Polynesian and Samoan Outlier languages. Archeological and linguistic data suggest that further colonization of the eastern islands of Polynesia occurred via Samoa within the last 2,000 to 800 years before present (Bellwood 1978; Kirch 1997). This recent 1,000-year layover in Samoa may have led to the severe genetic distinctness that is observed in both the dendrogram and the MDS plot and paralleled by the documented linguistic subdifferentiation.

Our results also could be interpreted as indicating a considerable genetic contribution to the Samoan gene pool from another source, such as neighboring Melanesia. In this scenario, depending on the proportion of a Papuan genetic component, the Samoans would be expected to appear as genetically distinct from the other groups examined in this study. This possibility is supported by the findings of a previous study in which biparental STRs displayed a pattern consistent with an initial Austronesian expansion into Remote Oceania from Southeast Asia followed by a significant amount of gene flow from Near Oceania (Lum et al. 2002). However, although our phylogenetic data on the Samoan population may be thought-provoking, additional relevant Pacific Ocean populations, including the Papuans, need to be studied in order to examine these issues.

Acknowledgments We gratefully acknowledge Laisel Martinez, Diane J. Rowold, and Maria Christina Terreros for their constructive criticism of the manuscript.

Received 13 May 2005; revision received 5 September 2005.

Literature Cited

Alves, C., L. Gusmao, A. Damasceno et al. 2004. Contribution for an African autosomal STR data-base (AmpFISTR Identifiler and Powerplex 16 system) and a report on genotypic variations.Forensic Sci. Int. 139:201–205.

Antunez de Mayolo, G., A. Antunez de Mayolo, P. Antunez de Mayolo et al. 2002. Phylogenetics ofworldwide human populations as determined by polymorphic Alu insertions. Electrophoresis23:3346–3356.

Beleza, S., C. Alves, F. Reis et al. 2004. 17 STR (AmpFISTR Identifiler and Powerplex 16 system)from Cabinda (Angola). Forensic Sci. Int. 141:193–196.

Bellwood, P. 1978. The Polynesians. London: Thames and Hudson.Bellwood, P. 2001. Early agriculturalist population diasporas? Farming, languages and genes. Annu.

Rev. Anthropol. 30:181–207.Bosch, E., F. Calafell, A. Perez-Lezaun et al. 2000. Genetic structure of northwest Africa revealed by

STR analysis. Eur. J. Hum. Genet. 8:360–366.Bowcock, A., A. Ruiz-Linares, J. Tomfohrde et al. 1994. High resolution of human evolutionary trees

with polymorphic microsatellites. Nature 368:455–457.Brenner, C., and J. Morris. 1990. Paternity index calculations in single locus hypervariable DNA

probes: Validation and other studies. In Proceedings for the International Symposium onHuman Identification 1989. Madison, WI: Promega, 21–53.

Butler, J. M. 2001. Forensic DNA Typing: Biology and Technology Behind STR Markers. London:Academic Press.

PAGE 849................. 15768$ $CH7 02-21-06 11:53:47 PS

Page 26: Autosomal STR Variation in Five Austronesian Populations · Autosomal STR Variation in Five Austronesian Populations ... biogeographic information traced back at least two generations.

850 / shepard et al.

Butler, J. M., R. Schoske, P. M. Vallone et al. 2003. Allele frequencies for 15 autosomal STR loci onU.S. Caucasian, African American, and Hispanic populations. J. Forensic Sci. 48(4):908–911.

Capelli, C., J. F. Wilson, M. Richards et al. 2001. A predominantly indigenous paternal heritage forthe Austronesian-speaking peoples of insular Southeast Asia and Oceania. Am. J. Hum. Genet.68:432–443.

Carmody, G. 1990. G-Test. Ottawa, Canada: Carleton University.Cavalli-Sforza, L. L., P. Menozzi, and A. Piazza. 1994. The History and Geography of Human Genes.

Princeton, NJ: Princeton University Press.Chiurillo, M. A., A. Morales, A. M. Mendes et al. 2003. Genetic profiling of a central Venezuelan

population using 15 STR markers that may be of forensic importance. Forensic Sci. Int.136:99–101.

Chow, R. A., J. L. Caeiro, S. J. Chen et al. 2005. Genetic characterization of four Austronesian-speaking populations. J. Hum. Genet. (in press).

Collins, P. J., L. K. Hennessy, C. S. Leibelt et al. 2004. Developmental validation of a single-tubeamplification of the 13 CODIS STR loci, D2S1338, D19S433 and amelogenin: The Amp-FISTR Identifiler PCR Amplification kit. J. Forensic Sci. 49(6):1265–1277.

Decorte, R., M. Engelen, L. Larno et al. 2004. Belgian population data for 15 STR loci (AmpFISTRSGM Plus and AmpFISTR Profiler PCR amplification kit). Forensic Sci. Int. 139:211–213.

Excoffier, L., P. E. Smouse, and J. M. Quattro. 1992. Analysis of molecular variance inferred frommetric distances among DNA haplotypes: Application to human mitochondrial DNA restric-tion data. Genetics 131:479–491.

Felsenstein, J. 2002. Phylogeny Inference Package (PHYLIP), Version 3.6a3. Distributed by author.Seattle: Department of Genetics, University of Washington.

Guo, S., and E. Thompson. 1992. Performing the exact test of Hardy-Weinberg proportion for multi-ple alleles. Biometrics 48:361–372.

Hashiyada, M., Y. Itakura, T. Nagashima et al. 2003. Polymorphism of 17 STRs by multiplex analysisin Japanese population. Forensic Sci. Int. 133:250–253.

Horai, S., K. Hayasaka, R. Kondo et al. 1995. Recent African origin of modern humans revealed by complete sequences of hominoid mitochondrial DNAs. Proc. Natl. Acad. Sci. USA 92:532–536.

Jones, D. A. 1972. Blood samples: Probability of discrimination. J. Forensic Sci. Soc. 12:355–359.

Jorde, L. B., M. J. Bamshad, W. S. Watkins et al. 1995. Origins and affinities of modern humans: A comparison of mitochondrial and nuclear genetic data. Am. J. Hum. Genet. 57:523–538.

Jorde, L. B., A. R. Rogers, M. Bamshad et al. 1997. Microsatellite diversity and the demographic history of modern humans. Proc. Natl. Acad. Sci. USA 94:3100–3103.

Kirch, P. V. 1997. The Lapita Peoples: Ancestors of the Oceanic World. Cambridge, MA: Blackwell.

Knapp, R. G. 1980. China’s Final Frontier: Studies in the Historical Geography of Taiwan. Taipei: SMC Publishing.

Leibelt, C., B. Budowle, P. Collins et al. 2003. Identification of a D8S1179 primer binding site mutation and the validation of a primer designed to recover null alleles. Forensic Sci. Int. 133(3):220–227.

Levene, H. 1949. On a matching problem arising in genetics. Ann. Math. Stat. 20:91–94.

Li, C. C. 1976. First Course in Population Genetics. Pacific Grove, CA: Boxwood Press.

Lum, J. K., L. B. Jorde, and W. Schiefenhovel. 2002. Affinities among Melanesians, Micronesians, and Polynesians: A neutral biparental genetic perspective. Hum. Biol. 74(3):413–430.

Lum, J. K., O. Rickards, C. Ching et al. 1994. Polynesian mitochondrial DNAs reveal three deep maternal lineage clusters. Hum. Biol. 66:567–590.

Melton, T., S. Clifford, J. J. Martinson et al. 1998. Genetic evidence for the Proto-Austronesian homeland in Asia: mtDNA and nuclear DNA variation in Taiwanese aboriginal tribes. Am. J. Hum. Genet. 63:1807–1823.

Melton, T., R. Peterson, A. J. Redd et al. 1995. Polynesian genetic affinities with Southeast Asian populations as identified by mtDNA analysis. Am. J. Hum. Genet. 57:403–414.


Nei, M. 1987. Molecular Evolutionary Genetics. New York: Columbia University Press.

Ota, T. 1993. DISPAN: Genetic Distance and Phylogenetic Analysis. University Park, PA: Institute of Molecular Evolutionary Genetics, Pennsylvania State University.

Othman, M. I., L. H. Seah, S. Panneerchelvan et al. 2004. Allele frequencies for the PowerPlex 16 STR loci in Javanese population from Malaysia. J. Forensic Sci. 49(1):190–191.

Parra, E., M. D. Shriver, A. Soemantri et al. 1999. Analysis of five Y-specific microsatellite loci in the Asian and Pacific populations. Am. J. Phys. Anthropol. 110(1):1–16.

Perez-Miranda, A. M., M. A. Alfonso-Sanchez, A. Kalantar et al. 2005. Allelic frequencies of 13 STR loci in autochthonous Basques from the province of Vizcaya (Spain). Forensic Sci. Int. 152(2–3):259–262.

Redd, A. J., N. Takezaki, S. T. Sherry et al. 1995. Evolutionary history of the COII/tRNALys intergenic 9 base pair deletion in human mitochondrial DNAs from the Pacific. Mol. Biol. Evol. 12:604–615.

Reynolds, J., B. S. Weir, and C. C. Cockerham. 1983. Estimation of the coancestry coefficient: Basis for a short term genetic distance. Genetics 105:767–779.

Rowold, D. J., and R. J. Herrera. 2003. Inferring recent human phylogenies using forensic STR technology. Forensic Sci. Int. 133:260–265.

Schneider, S., J.-M. Kueffer, D. Roessli et al. 2000. Arlequin v. 2000: A Software for Population Genetics Data Analysis. Geneva: Genetics and Biometry Laboratory, University of Geneva.

Seah, L. H., N. H. Jeevan, M. I. Othman et al. 2003. STR data for the AmpFlSTR Identifiler loci in three ethnic groups (Malay, Chinese, Indian) of the Malaysian population. Forensic Sci. Int. 138:134–137.

Sewerin, B., F. J. Cuza, M. N. Szmulewicz et al. 2002. On the genetic uniqueness of the Ami aborigines of Formosa. Am. J. Phys. Anthropol. 119:240–248.

Shepard, E. M., and R. J. Herrera. 2005. Iranian STR variation at the fringes of biogeographical demarcation. Forensic Sci. Int. (in press).

SPSS Inc. 2001. SPSS for Windows, Release 11.0.1. Chicago: SPSS Inc.

Szczerkowska, Z., E. Kapinska, J. Wysocka et al. 2004. Northern Polish population data and forensic usefulness of 15 autosomal STR loci. Forensic Sci. Int. 144:69–71.

Tereba, A. 1999. Tools for analysis of population statistics. In Profiles in DNA, I. MacIver, ed. Madison, WI: Promega Corporation, v. 2, pp. 14–16.

Underhill, P. A. 2004. A synopsis of extant Y chromosome diversity in East Asia and Oceania. In The Peopling of East Asia: Putting Together Archeology, Linguistics, and Genetics, L. Sagart, R. Blench, and A. Sanchez-Mazas, eds. London: Routledge Curzon, ch. 17, pp. 301–319.

Wang, C.-W., D.-P. Chen, C.-Y. Chen et al. 2003. STR data for the AmpFlSTR SGM Plus and Profiler loci from Taiwan. Forensic Sci. Int. 138:119–122.

Watkins, W. S., A. R. Rogers, C. T. Ostler et al. 2003. Genetic variation among world populations: Inferences from 100 Alu insertion polymorphisms. Genome Res. 13:1607–1618.
