INTERPRETATION OF GENETIC DATA OBTAINED FROM TURKISH NATIVE SHEEP BREEDS BY USE OF BIOINFORMATICS TOOLS

Hande ACAR(a), H. Alper DÖM(a), Evren KOBAN(b), Havva DİNÇ(a), Ş. Anıl DOĞAN(c), Ceren CANER BERKMAN(a), N. Dilşad DAĞTAŞ(a), Inci TOGAN(a)*

(a) Middle East Technical University, Department of Biology, 06531 Ankara
(b) TÜBİTAK MAM GMBE, 41470 Gebze, Kocaeli
(c) CECAD, University of Cologne, D-50674 Cologne, Germany

*e-mail: [email protected]

ABSTRACT

In the present study, within-breed and among-breed genetic variability in nine native Turkish sheep breeds (Sakız, Karagül, Hemşin, Çine Çaparı, Norduz, Herik, Dağlıç, Morkaraman, Kıvırcık) was analyzed based on 17 microsatellite loci. Before the analysis, the data were re-examined and corrected where necessary to ensure the reliability of the microsatellite data and of the results. A variety of measures and statistical methods were then employed: F-statistics and heterozygosity, Factorial Correspondence Analysis (FCA), Principal Component Analysis (PCA), Neighbor Joining (NJ) tree construction and Structure analysis.

According to the pairwise FST values, all of the breeds are highly significantly differentiated from each other (p < 0.001). Morkaraman and Çine Çaparı, which show the highest pairwise FST value, appear to be genetically the most different breeds. The NJ tree constructed from Nei's DA genetic distance supports the FST results. However, geographically distant breeds are shown to be genetically similar, which is supported by the FCA and PCA. All these results indicate that morphologically "pure" individuals of the breeds are genetically mixed and that the breeds largely overlap. Structure analysis helps to identify the individuals that are highly mixed. Hence, bioinformatics and genetics can be used to build conservation flocks composed of relatively pure individuals of each breed.

AIMS OF THE STUDY

•To unveil the biodiversity pattern of native Turkish sheep breeds using microsatellite genetic markers (Doğan, 2009 and the data of the current study).

•To contribute to conservation studies by identifying mixed individuals, using bioinformatics tools in collaboration with biological methods.

•To test the usefulness of new statistical software for phylogenetic and evolutionary studies.

INTRODUCTION

•According to Bruford (2003) and Zeder (2008), sheep were among the first domesticated animals. The domestication center is thought to extend from the northern Zagros to central Anatolia, at approximately 9000 BC.

•Because Turkey lies within the domestication center, it is expected to have high genetic diversity, an expectation borne out by the analyses made here (Bruford et al., 2003; Uzun et al., 2006; Lawson-Handley et al., 2007; Peter et al., 2007; Koban et al., 2010; Doğan, 2009).

•Moreover, Clutton-Brock (1981) argues that domestic sheep spread from Turkey to Europe and beyond. During this dispersal, some important genetic diversity must have been lost in other breeds and may still exist in native Turkish sheep breeds.

•For these reasons, Turkish sheep breeds should have high priority in conservation studies.

MATERIALS AND METHODS

•Samples were taken from at least 20 flocks for each breed, and 2-3 old individuals were collected from each flock.

•DNA was isolated from blood by the phenol:chloroform method (Sambrook et al., 1989). 17 loci (OarFCB226, OarFCB128, OarFCB304, OarVH72, OarJMP58, INRA063, MAF33, MAF214, BM8125, DYMS1, OarJMP29, ILSTS005, OarFCB20, OarFCB48, MAF65, MCM140, ILSTS011, MAF209) were used for the microsatellite analyses, and amplification was carried out by multiplex PCR. Genetic data were obtained by fragment analysis.

•The following analyses and methods were applied to the microsatellite data: F-statistics (Goudet, 2002) and heterozygosity (Nei, 1987); Factorial Correspondence Analysis (FCA) (Lebart et al., 1984); Neighbor Joining (NJ) tree construction (Saitou and Nei, 1987); Structure analysis (Pritchard et al., 2000).

RESULTS AND DISCUSSION

Pairwise FST values between the populations were estimated with the Arlequin package (Excoffier et al., 2006). The FST values, used as a measure of genetic differentiation between populations, are given in Table 1.

             SAK       KRG       HEM       ÇİÇ       NOR       HER       DAG       MRK
SAKIZ        0.0000
KARAGÜL      0.05061*  0.0000
HEMŞİN       0.05913*  0.05188*  0.0000
ÇİNE ÇAPARI  0.07898*  0.06242*  0.03967*  0.0000
NORDUZ       0.03955*  0.03701*  0.01787*  0.02771*  0.0000
HERİK        0.03083*  0.02030*  0.03311*  0.04590*  0.01686*  0.0000
DAĞLIÇ       0.07193*  0.07758*  0.05824*  0.06971*  0.04195*  0.05747*  0.0000
MORKARAMAN   0.06972*  0.08301*  0.06010*  0.08302*  0.04133*  0.06124*  0.00976*  0.0000
KIVIRCIK     0.06102*  0.06296*  0.04417*  0.07021*  0.03502*  0.04265*  0.01596*  0.03205*

Table 1. Pairwise FST values between the breeds (*: p < 0.001).

             FIS
SAKIZ         0.03305 ns
KARAGÜL      -0.01969 ns
HEMŞİN        0.10419 ns
ÇİNE ÇAPARI   0.13735 ns
NORDUZ        0.04713 ns
HERİK         0.10425 ns
DAĞLIÇ        0.05680 ns
MORKARAMAN    0.05998 ns
KIVIRCIK      0.08956 ns

Table 2. FIS values of the breeds (ns: not significant).

According to the pairwise FST values, all of the breeds are highly significantly differentiated from each other (p < 0.001). Morkaraman and Çine Çaparı, which show the highest pairwise FST value (0.08302), appear to be genetically the most different breeds.
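The ranking of breed pairs can be checked directly against Table 1. The sketch below is only an illustration: it copies the lower-triangular FST values from the table into a Python list and sorts the 36 pairs. The function name `ranked_pairs` is an assumption introduced here, not part of any of the packages used in the study.

```python
# Lower-triangular pairwise FST values copied from Table 1 (diagonal omitted).
BREEDS = ["Sakız", "Karagül", "Hemşin", "Çine Çaparı", "Norduz",
          "Herik", "Dağlıç", "Morkaraman", "Kıvırcık"]

LOWER = [
    [],
    [0.05061],
    [0.05913, 0.05188],
    [0.07898, 0.06242, 0.03967],
    [0.03955, 0.03701, 0.01787, 0.02771],
    [0.03083, 0.02030, 0.03311, 0.04590, 0.01686],
    [0.07193, 0.07758, 0.05824, 0.06971, 0.04195, 0.05747],
    [0.06972, 0.08301, 0.06010, 0.08302, 0.04133, 0.06124, 0.00976],
    [0.06102, 0.06296, 0.04417, 0.07021, 0.03502, 0.04265, 0.01596, 0.03205],
]


def ranked_pairs(breeds, lower):
    """Return (fst, breed_a, breed_b) tuples sorted from lowest to highest FST."""
    pairs = []
    for i, row in enumerate(lower):
        for j, fst in enumerate(row):
            # Row i holds the comparisons of breed i with breeds 0..i-1.
            pairs.append((fst, breeds[j], breeds[i]))
    return sorted(pairs)


pairs = ranked_pairs(BREEDS, LOWER)
lowest, highest = pairs[0], pairs[-1]
print("lowest :", lowest)
print("highest:", highest)
```

With the values of Table 1, the highest pair is Çine Çaparı and Morkaraman, and the lowest is Dağlıç and Morkaraman (0.00976).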

Table 2 shows that the FIS values do not differ significantly from 0 in any of the breeds. This indicates that the sampled individuals are not related to a degree that could affect the results of the analyses. Furthermore, it supports the conclusion that genetic data collection and processing were not faulty.

Of the 153 FIS values obtained for each breed and locus (9 breeds x 17 loci), only 12 are negative (less than 10%), and 5 of these are concentrated in the Karagül breed. A negative FIS indicates an excess of heterozygotes relative to Hardy-Weinberg expectations, whereas a positive FIS indicates a heterozygote deficiency, as expected under inbreeding. The excess is seen mainly in Karagül, and its magnitude is negligible.
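To make the sign of FIS concrete, the following sketch computes observed heterozygosity (Ho), Nei's (1987) unbiased expected heterozygosity (He) and FIS = 1 - Ho/He for a single locus. It is a simplified illustration and not necessarily the exact estimator implemented in FSTAT or Arlequin; the function name `locus_fis` and the toy genotypes (allele sizes 100 and 104) are assumptions made up for the example.

```python
from collections import Counter


def locus_fis(genotypes):
    """Return (Ho, He, FIS) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    He is Nei's (1987) unbiased expected heterozygosity.
    """
    n = len(genotypes)
    if n < 2:
        raise ValueError("need at least two individuals")
    # Observed heterozygosity: share of individuals carrying two different alleles.
    ho = sum(a != b for a, b in genotypes) / n
    # Allele frequencies over the 2n gene copies.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    freqs = [c / (2 * n) for c in counts.values()]
    he = (2 * n / (2 * n - 1)) * (1 - sum(p * p for p in freqs))
    if he == 0:
        raise ValueError("locus is monomorphic; FIS is undefined")
    return ho, he, 1 - ho / he


# Toy data (hypothetical): same allele frequencies, different genotype counts.
deficit = [(100, 100)] * 4 + [(104, 104)] * 4 + [(100, 104)] * 2
excess = [(100, 104)] * 8 + [(100, 100)] + [(104, 104)]

ho_d, he_d, fis_deficit = locus_fis(deficit)
ho_e, he_e, fis_excess = locus_fis(excess)
print(f"heterozygote deficit: FIS = {fis_deficit:.3f}")
print(f"heterozygote excess : FIS = {fis_excess:.3f}")
```

The first toy sample has fewer heterozygotes than expected and yields a positive FIS; the second has more heterozygotes than expected and yields a negative FIS.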

Figure: Genetic distances among the breeds shown as an NJ tree based on 17 microsatellite loci and the DA genetic distance. Bootstrap values from 1000 replicates are placed on the nodes.

In this study, NJ trees were constructed to observe the relationships among the nine populations. Neighbor-joining trees were produced with the NEIGHBOR program, 1000 bootstrap replicates were generated with the SEQBOOT program, and a consensus tree was built with the CONSENSE program, all as implemented in PHYLIP 3.6 (Felsenstein, 1993).
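The distance underlying the tree can be illustrated with a short sketch. It assumes the standard definition of Nei's DA distance, DA = 1 - (1/r) Σ_j Σ_i sqrt(x_ij · y_ij), where x_ij and y_ij are the frequencies of allele i at locus j in the two populations and r is the number of loci. The function name `da_distance` and the toy allele frequencies are assumptions for illustration; this is not the PHYLIP code used in the study.

```python
import math


def da_distance(freqs_x, freqs_y):
    """Nei's DA distance between two populations.

    freqs_x, freqs_y: one dict per locus mapping allele -> frequency,
    with the loci in the same order in both lists.
    """
    if len(freqs_x) != len(freqs_y) or not freqs_x:
        raise ValueError("both populations need the same, non-empty set of loci")
    total = 0.0
    for locus_x, locus_y in zip(freqs_x, freqs_y):
        # Only alleles present in both populations contribute to the sum.
        shared = set(locus_x) & set(locus_y)
        total += sum(math.sqrt(locus_x[a] * locus_y[a]) for a in shared)
    return 1.0 - total / len(freqs_x)


# Toy allele frequencies at two loci (hypothetical values).
pop_a = [{"100": 0.5, "104": 0.5}, {"200": 1.0}]
pop_b = [{"100": 1.0}, {"200": 1.0}]
pop_c = [{"102": 1.0}, {"202": 1.0}]

d_aa = da_distance(pop_a, pop_a)
d_ab = da_distance(pop_a, pop_b)
d_ac = da_distance(pop_a, pop_c)
print(d_aa, d_ab, d_ac)
```

Identical populations give a distance of 0, populations sharing no alleles give 1, and partial sharing falls in between. A matrix of such distances over all breed pairs is the kind of input a neighbor-joining program takes.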

The constructed tree groups spatially remote breeds as "similar". For instance, Morkaraman (MRK) and Dağlıç (DAG) form a pair with the highest bootstrap value. Çine Çaparı and Hemşin form another pair with the highest bootstrap value.

Geographically distant breeds are thus shown to be genetically similar, which is also supported by the FCA. All these results indicate that morphologically "pure" individuals of the breeds are genetically mixed and that the breeds largely overlap. Structure analysis helps to identify the individuals that are highly mixed.

The Factorial Correspondence Analysis (FCA) was performed to visualize the individuals in multidimensional space and to explore the relationships between them. GENETIX v. 4.03 (Belkhir et al., 1996) was used for the analysis. The three-dimensional FCA plot of breed averages accounts for 56.63% of the total variation. The axes shown in the figure are those that display the highest variation.

•In the present study, samples were taken from nine sheep breeds: Sakız (İzmir, n=49), Karagül (Tokat, n=50), Hemşin (Artvin, n=48), Çine Çaparı (Aydın, n=41), Norduz (Van, n=46), Herik (Amasya, n=49), Dağlıç (Afyon, n=50), Morkaraman (Erzurum, n=50), Kıvırcık (Kırklareli, n=45).

Figure: FCA results for the group averages of the nine breeds studied.

The Structure v2.2 software was used to test for population substructure and to identify genetically distinct populations among the nine sheep breeds (Pritchard et al., 2000). The burn-in length was set to 100,000, which is accepted as adequate (Falush et al., 2007). The "Admixture model" was used in this study. The number of distinct populations, denoted K, was set to 5, and the number of iterations was 20.
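How individual membership coefficients from such an analysis might be used to pick out highly mixed individuals can be sketched as follows. This is an assumption-laden illustration, not the procedure of the study: the function name `flag_admixed`, the 0.8 cutoff, the individual IDs and the coefficient values are all hypothetical, and the input is assumed to be a table of per-individual membership proportions over K = 5 clusters.

```python
def flag_admixed(q_rows, threshold=0.8):
    """Split individuals by their largest membership coefficient.

    q_rows: dict mapping individual ID -> list of K membership coefficients.
    threshold: hypothetical cutoff; an individual whose largest coefficient
               is below it is treated as admixed.
    Returns (assigned, admixed): a dict ID -> cluster number (1..K) and a
    list of admixed IDs.
    """
    assigned, admixed = {}, []
    for ind, q in q_rows.items():
        # Coefficients are proportions and should sum to about 1.
        if abs(sum(q) - 1.0) > 1e-2:
            raise ValueError(f"coefficients of {ind} do not sum to 1")
        best = max(range(len(q)), key=q.__getitem__)
        if q[best] >= threshold:
            assigned[ind] = best + 1
        else:
            admixed.append(ind)
    return assigned, admixed


# Toy membership coefficients for K = 5 (hypothetical IDs and values).
q_table = {
    "MRK_01": [0.92, 0.02, 0.02, 0.02, 0.02],
    "MRK_02": [0.45, 0.40, 0.05, 0.05, 0.05],
    "SAK_01": [0.03, 0.03, 0.88, 0.03, 0.03],
}

assigned, admixed = flag_admixed(q_table)
print("assigned:", assigned)
print("admixed :", admixed)
```

Individuals that pass the cutoff would be candidates for a conservation flock; the choice of cutoff is a judgment call and would need to be justified for real data.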

Figure: Structure bar plot based on LnP(D) when K=5; population names are given below the graph.

REFERENCES

•Belkhir, K et al. 1996. Genetix 4.01, Windows™ Software for Population Genetics. Laboratoire Génome, Populations, Intéractions, University of Montpellier, France.
•Bruford, MW et al. 2003. DNA Markers Reveal the Complexity of Livestock Domestication. Nat Rev Genet 4 (11): 900-10.
•Clutton-Brock, J. 1981. Domesticated Animals From Early Times. Heinemann/British Museum (Natural History).
•Doğan, ŞA. 2009. Reassessment of Genetic Diversity in Native Turkish Sheep Breeds with Large Numbers of Microsatellite Markers and Mitochondrial DNA (mtDNA). Master Thesis, Middle East Technical University, Ankara, Turkey.
•Excoffier, L et al. 2006. Arlequin Version 3.01. An Integrated Software Package for Population Genetics Data Analysis. Computational and Molecular Population Genetics Lab (CMPG), Institute of Zoology, University of Berne.
•Falush, D et al. 2007. Inference of Population Structure Using Multilocus Genotype Data: Dominant Markers and Null Alleles. Mol Ecol Notes 7 (4): 574-78.
•Felsenstein, J. 1993. Phylip (Phylogeny Inference Package) Version 3.5 C. Department of Genetics, University of Washington, Seattle 1118
•Goudet, J. 2002. Fstat 2.9.3.2. URL: http://www2.unil.ch/popgen/softwares/fstat.htm.
•Koban, E, Trinidad, P, Bruford, MW, Togan, I. 2010. A Genetic Analysis of Marginal Sheep Breeds from Turkey. Manuscript submitted.
•Lawson Handley, LJ et al. 2007. Genetic Structure of European Sheep Breeds. Heredity 99: 620-31.
•Lebart, L et al. 1984. Multivariate Descriptive Analysis: Correspondence Analysis and Related Techniques for Large Matrices.
•Nei, M. 1987. Molecular Evolutionary Genetics. New York: Columbia University Press.
•Peter, C et al. 2007. Genetic Diversity and Subdivision of 57 European and Middle-Eastern Sheep Breeds. Anim Genet 38 (1): 37-44.
•Pritchard, JK et al. 2000. Inference of Population Structure Using Multilocus Genotype Data. Genetics 155 (2): 945-59.
•Saitou, N and Nei, M. 1987. The Neighbor-Joining Method: A New Method for Reconstructing Phylogenetic Trees. Mol Biol Evol 4: 406-25.
•Sambrook, J et al. 1989. Molecular Cloning: A Laboratory Manual. 2 ed. Vol. 3, New York: Cold Spring Harbor Laboratory, Cold Spring Harbor, USA.
•Uzun, M et al. 2006. Genetic Relationships Among Turkish Sheep. Genet Sel Evol 38: 513-24.
•Zeder, MA. 2008. Domestication and Early Agriculture in the Mediterranean Basin: Origins, Diffusion, and Impact. Proc Natl Acad Sci U S A 105 (33): 11597-604.

Note:
•This study was supported by the Scientific and Technical Research Council of Turkey (TÜBİTAK) as a part of the project In Vitro Conservation and Preliminary Molecular Identification of Some Turkish Domestic Animal Genetic Resources-I (TURKHAYGEN-I) under grant number 106G115.

CONCLUSIONS

•Genetic data based on microsatellite markers reveal that native Turkish sheep breeds are not pure:
  •The constructed phylogenetic tree does not match the spatial distribution of the breeds.
  •The breeds show a significant degree of mixture, in line with the practice of frequently introducing rams from neighboring flocks irrespective of their breed.

•The use of bioinformatics tools in collaboration with biological methods will contribute to conservation studies.
