  • Molecular Ecology (2005) 14, 2525–2537
    doi: 10.1111/j.1365-294X.2005.02593.x

    © 2005 Blackwell Publishing Ltd

    Parentage versus two-generation analyses for estimating pollen-mediated gene flow in plant populations

    JAROSLAW BURCZYK and TOMASZ E. KORALEWSKI

    Department of Genetics, Institute of Biology and Environmental Protection, University of Bydgoszcz, 85–064 Bydgoszcz, Poland

    Correspondence: Jaroslaw Burczyk, Fax: (+48-52) 360-82-06; E-mail: [email protected]

    Abstract

    Assessment of contemporary pollen-mediated gene flow in plants is important for various aspects of plant population biology, genetic conservation and breeding. Here, through simulations we compare the two alternative approaches for measuring pollen-mediated gene flow: (i) the NEIGHBORHOOD model, a representative of parentage analyses, and (ii) the recently developed TWOGENER analysis of pollen pool structure. We investigate their properties in estimating the effective number of pollen parents ($N_{ep}$) and the mean pollen dispersal distance ($\delta$). We demonstrate that both methods provide very congruent estimates of $N_{ep}$ and $\delta$ when the methods’ assumptions concerning the shape of the pollen dispersal curve and the mating system follow those used in data simulations, although the NEIGHBORHOOD model exhibits generally lower variances of the estimates. Violations of the assumptions, especially increased selfing or long-distance pollen dispersal, affect the two methods to a different degree; however, they are still capable of providing comparable estimates of $N_{ep}$. The NEIGHBORHOOD model inherently allows estimation of both self-fertilization and outcrossing due to long-distance pollen dispersal; however, the TWOGENER method is particularly sensitive to inflated selfing levels, which in turn may confound and suppress the effects of distant pollen movement. As a solution, we demonstrate that in the case of TWOGENER it is possible to extract the fraction of intraclass correlation that results from outcrossing only, which seems to be very relevant for measuring pollen-mediated gene flow. The two approaches differ in estimation precision and experimental effort, but they seem to be complementary depending on the main research focus and the type of population studied.

    Keywords: effective number, gene flow, NEIGHBORHOOD model, paternity, pollen dispersal, TWOGENER

    Received 25 November 2004; revision received 16 February 2005; accepted 1 April 2005

    Introduction

    Gene flow within and among populations is an important component of plant reproductive systems, affecting both the genetic structure and adaptation of populations (Rieseberg & Burke 2001; Lenormand 2002). Plants, especially forest trees, can disperse their genes over large distances via pollen (Hjelmroos 1991; Lindgren et al. 1995; Di-Giovanni et al. 1996; Nason et al. 1998; Rogers & Levetin 1998). Therefore, studies of pollen-mediated gene flow have received much attention in recent decades (Ellstrand 1992; Sork et al. 1999; Hamrick & Nason 2000; Smouse & Sork 2004). Direct assessment of contemporary, pollen-mediated gene flow is important because it can provide information on a population’s current dynamics and may shed light on the ecological constraints that affect pollen dispersal (Sork et al. 1999).

    Several methods have been developed for directly measuring pollen-mediated gene flow, most of which rely on paternity exclusion (Smith & Adams 1983; Devlin & Ellstrand 1990; Burczyk & Chybicki 2004) or paternity assignment (Devlin et al. 1988; Chase et al. 1996; Dow & Ashley 1998; Streiff et al. 1999). However, paternity-based methods are usually laborious, require highly variable genetic markers, and are inherently limited by not being able to sample all potential fathers in large continuous populations. Smouse et al. (2001) proposed the TWOGENER
    approach, which attempts to estimate the extent of pollen movement based on estimates of genetic differentiation among pools of pollen gametes effectively fertilizing ovules of different mother plants. Because the TWOGENER approach requires less sampling and genotyping compared to conventional paternity-based methods, it seems to be an efficient way to estimate long-distance gene flow in continuous plant populations (Sork et al. 2002a; Austerlitz et al. 2004).

    Smouse & Sork (2004) recently reviewed the strengths and weaknesses of paternity-based and TWOGENER methods for estimating pollen-mediated gene flow. These approaches rely on different assumptions and may yield different estimates of gene flow. Although applications of paternity-based methods suggest that much of the pollen-mediated gene flow occurs over large distances (see Adams & Burczyk 2000; Hamrick & Nason 2000; Burczyk et al. 2004a; DiFazio et al. 2004; Smouse & Sork 2004 for reviews), the studies based on the TWOGENER method suggest that pollen-mediated gene flow is more restricted (Dyer & Sork 2001; Smouse et al. 2001; Sork et al. 2002a; Smouse & Sork 2004; but see Robledo-Arnuncio et al. 2004). These differences may result from corresponding differences in the species or populations studied, assumptions used, or the contrasting statistics that were used to judge the degree of pollen flow (Smouse & Sork 2004). To resolve these discrepancies, it would be desirable to compare both approaches using the same materials. This is difficult, however, because paternity-based methods work best when large numbers of offspring are sampled from a few mothers, whereas the TWOGENER approach works best when few offspring are sampled from a large number of mothers (Smouse & Sork 2004). Although some authors found that estimates of pollen-mediated gene flow are similar using the two methods (Austerlitz et al. 2004), they have not been compared using computer simulations, which allow both methods to be compared simultaneously under controlled settings.

    In this study, we used computer simulations to compare the NEIGHBORHOOD model (a representative of the paternity-based methods) (Adams & Birkes 1991; Burczyk et al. 2002) with the TWOGENER approach (Smouse et al. 2001). We were particularly interested in determining whether both methods provide similar estimates of mean pollen dispersal distance and effective number of pollen parents. We also studied how violations of the assumptions affected the accuracy of the pollen flow estimates. Although the NEIGHBORHOOD model and TWOGENER approach were recently modified to address their previous shortcomings (Austerlitz & Smouse 2002; Burczyk et al. 2002; Dyer et al. 2004), here we deal only with the simplest versions of the NEIGHBORHOOD and TWOGENER models (Adams & Birkes 1991; Smouse et al. 2001), which seems sufficient for the straightforward comparison of the two methods.

    Materials and methods

    Detailed descriptions of the NEIGHBORHOOD model and TWOGENER approach are available elsewhere (NEIGHBORHOOD model: Adams & Birkes 1991; Burczyk et al. 1996, 2002; TWOGENER: Austerlitz & Smouse 2001a; Smouse et al. 2001; Sork et al. 2002a). In both methods, the paternal contribution (i.e. pollen gamete haplotype) must be determined for a sample of viable embryos collected from known mother plants. These pollen haplotypes must be measured using neutral genetic markers such as allozymes or microsatellites. Both methods can be used in cases where the paternal haplotype can be measured directly and unambiguously (e.g. from the haploid megagametophyte of most conifers), or when the paternal haplotype is ambiguous (e.g. inferred from the genotypes of the offspring embryo and female parent). In the latter case, the paternal contribution will be ambiguous when the mother and offspring are both heterozygotes and share the same alleles. To facilitate our computer simulations, we assumed that the paternal contributions could be determined unambiguously. Nevertheless, the outcomes of the analyses are relevant to ambiguous assays as well.
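The ambiguity described above can be illustrated with a minimal sketch. It assumes diploid genotypes written as pairs of allele labels, one pair per locus; the function names and the toy genotypes are hypothetical and are not taken from the authors' software.

```python
def paternal_alleles(mother, offspring):
    """Return the set of alleles the pollen parent may have contributed at one locus.

    `mother` and `offspring` are 2-tuples of allele labels (assumed encoding).
    """
    first, second = offspring
    candidates = set()
    # If one offspring allele can be maternal, the other one is a paternal candidate.
    if first in mother:
        candidates.add(second)
    if second in mother:
        candidates.add(first)
    if not candidates:
        raise ValueError("offspring genotype is incompatible with the mother")
    return candidates


def paternal_haplotype(mother_genotype, offspring_genotype):
    """Per-locus candidate sets; a locus with two candidates is ambiguous."""
    return [paternal_alleles(m, o) for m, o in zip(mother_genotype, offspring_genotype)]


# Toy two-locus example: locus 1 is ambiguous, locus 2 is not.
mother = [("A", "B"), ("C", "C")]
embryo = [("A", "B"), ("C", "D")]
print(paternal_haplotype(mother, embryo))
```

Only the case where mother and offspring are the same heterozygote returns two candidates, which matches the condition stated in the text.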

    NEIGHBORHOOD model

    In the NEIGHBORHOOD model, the minimum data set consists of progeny haplotype arrays (i.e. pollen gametes) sampled from several mother plants, the genotypes of all individuals (and their locations) within the local population that could potentially sire sampled progeny, and allele frequencies in the surrounding populations that might be the source of immigrant pollen gametes. In the NEIGHBORHOOD model, we assume that a measured pollen gamete may originate from one of three sources: (i) self-fertilization (in monoecious and self-compatible plants) with probability $s$, (ii) pollination by a distant unknown father located outside the local population (neighbourhood) with probability $m$, and (iii) pollination by a local and genotyped father with probability $1 - s - m$. Therefore, the probability of observing a particular pollen gamete having genotype $g_i$ is defined as:

    $$P(g_i) = s \cdot P(g_i \mid M) + m \cdot P(g_i \mid B) + (1 - s - m) \sum_{j=1}^{r} \lambda_j P(g_i \mid F_j) \qquad \text{(eqn 1)}$$

    where $P(g_i \mid M)$, $P(g_i \mid B)$ and $P(g_i \mid F_j)$ are Mendelian transition probabilities (Devlin et al. 1988; Adams 1992). $P(g_i \mid M)$ is the probability that the pollen gamete having haplotype $g_i$ comes from the mother having genotype $M$; $P(g_i \mid B)$ is the probability that the pollen gamete comes from a distant and unknown father in the background population; and $P(g_i \mid F_j)$ is the probability that the pollen gamete comes from one of the $r$ local fathers having genotype $F_j$. $P(g_i \mid B)$ is calculated as the product of the background allele frequencies of the alleles forming the pollen gamete haplotype $g_i$. The parameter $\lambda_j$, which represents the relative mating success of the $j$-th father growing within the neighbourhood of the mother plant $M$, is related to the distance between the father and mother plants according to the exponential distribution:

    $$\lambda_j = \frac{\exp(\beta d_j)}{\sum_{l=1}^{r} \exp(\beta d_l)} \qquad \text{(eqn 2)}$$

    where $d_j$ is the distance between the $j$-th father and the mother plant, and $\beta$ is the dispersal parameter that describes the relationship between the distance of pollen dispersal within the neighbourhood and individual male mating success (Adams 1992).

    The likelihood function for $K$ mother plants is

    $$L(s, m, \beta) = \prod_{k=1}^{K} L_k(s, m, \beta) = \prod_{k=1}^{K} \prod_{i=1}^{n_k} P(g_i), \qquad \text{(eqn 3)}$$

    where $n_k$ is the number of offspring sampled from the $k$-th mother. The parameters of interest ($s$, $m$, $\beta$) are estimated using maximum-likelihood methods. Once the pollen dispersal parameter ($\beta$) has been estimated, both the mean effective pollen dispersal and effective number of pollen parents within neighbourhoods may be derived (Adams & Birkes 1991; Burczyk et al. 1996). The NEIGHBORHOOD model appears to be useful for estimating gene flow parameters in isolated and continuous populations (Burczyk et al. 1996, 2004b; Latouche-Halle et al. 2004). Recent versions of the NEIGHBORHOOD model are well suited to ambiguous assays and allow pollen-mediated gene flow to be estimated simultaneously with background pollen pool allele frequencies (Burczyk et al. 2002; Burczyk & Chybicki 2004).

    TWOGENER

    The TWOGENER model is a hybrid approach that combines traditional paternity analysis and genetic structure analysis (Smouse et al. 2001). Mother plants scattered throughout the landscape are used to ‘sample’ different sets of pollen donors, and the differentiation of pollen pools among the mothers (pollen structure) is measured using the intraclass correlation coefficient ($\Phi_{FT}$). The intraclass correlation coefficient, which is the correlation of male gametes drawn at random from the same mother, relative to those drawn at random from the population as a whole, is calculated using analysis of molecular variance (AMOVA, Excoffier et al. 1992). If one assumes that $\Phi_{FT}$ also estimates the probability that two paternal alleles are identical by descent, then it can be used to estimate the effective number of the pollen parents:

    $$\Phi_{FT} \approx \frac{1}{2N_{ep}} \;\Rightarrow\; N_{ep} \approx \frac{1}{2\Phi_{FT}}. \qquad \text{(eqn 4)}$$

    Austerlitz & Smouse (2001a) showed that assuming a specific function of pollen dispersal (bivariate normal or exponential), the mean distance of pollen dispersal ($\delta$) can be estimated directly from $\Phi_{FT}$ provided that the sampled mothers are far enough apart (ideally $¥ > 5\delta$, yet $¥ > 3\delta$ is reasonable, where ¥ is the mean distance between mothers). The original TWOGENER model has been recently modified to overcome some of its initial shortcomings (Austerlitz & Smouse 2002; Austerlitz et al. 2004; Dyer et al. 2004; Smouse & Sork 2004).

    Simulations

    The NEIGHBORHOOD model and TWOGENER approach use slightly different sampling schemes. The optimal scheme for the TWOGENER approach is to sample a small number of seeds from each of many mothers (Austerlitz & Smouse 2002), but in the NEIGHBORHOOD model (and paternity-based methods in general), it is better to sample more seeds from fewer parents (Burczyk et al. 1996; Streiff et al. 1999). We used a sampling scheme that is reasonable for both approaches. We generated 20 ($= K$) mothers and 50 ($= n_k$) successful pollen gametes for each mother, for a total of 1000 offspring per simulation. All adults were distributed on a grid of 1 × 1 units. The maternal plants were organized into five groups (similarly to Smouse et al. 2001), but the distance between the borders of the neighbouring groups was set to 80 units and the distance between the mothers within each group was at least 10 units. In our simulations, the mean distance between the mothers was ¥ ≈ 117 units. Therefore, we met the $¥ > 5\delta$ criterion (Austerlitz & Smouse 2001a) for the range of dispersal parameters used in our simulations (see Results), which enables estimation of the mean distance of pollen dispersal ($\delta$) directly from $\Phi_{FT}$. For the NEIGHBORHOOD model, it would be difficult to use this sampling density in practice, but it is feasible in computer simulations. For the genetic data, we simulated five microsatellite-like loci, each of which had 10 alleles with frequency of 0.1 (EP = 0.9996). Genotypes were simulated assuming Hardy–Weinberg equilibrium (HWE), no linkage, no mutations, and no genotyping errors.

    The generation of pollen gametes assumed that the composition of pollen pools from sampled mothers depends solely on the distances from the surrounding fathers, and that pollen dispersal follows an exponential distribution:

    $$f(x) = b \exp(-bx), \qquad \text{(eqn 5)}$$

    where $x$ is the distance of pollen dispersal, and $b$ is the slope of the dispersal curve. This is a one-parameter exponential model that can be considered a specific case of the more general two-parameter exponential power model often used for pollen dispersal studies (Austerlitz et al. 2004). We chose the one-parameter exponential distribution for its simplicity, and because this distribution has already been integrated into paternity-based methods (equation 2) (Adams & Birkes 1991; Smouse et al. 1999) and the TWOGENER model (Austerlitz & Smouse 2001a; Smouse et al. 2001). Therefore, the method used to generate the data was identical for both methods of gene flow analysis.
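To make the exponential kernel of equation (2) and the quantities built on it concrete, here is a minimal Python sketch of equations (1) to (4). It is an illustration under stated assumptions, not the authors' n2g program: the function names, the `sibships` data layout, the toy numbers, and the choice to reject non-positive $\Phi_{FT}$ are all hypothetical, and the Mendelian transition probabilities are taken as precomputed inputs.

```python
import math


def mating_success(distances, beta):
    """Equation 2: lambda_j = exp(beta * d_j) / sum_l exp(beta * d_l)."""
    weights = [math.exp(beta * d) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


def gamete_probability(s, m, p_mother, p_background, p_fathers, lambdas):
    """Equation 1: mixture over selfing, immigrant pollen and local fathers."""
    local = sum(lam * p for lam, p in zip(lambdas, p_fathers))
    return s * p_mother + m * p_background + (1.0 - s - m) * local


def log_likelihood(s, m, beta, sibships):
    """Logarithm of equation 3.

    `sibships` (assumed layout) holds one entry per mother: the distances to
    her local fathers, and for each sampled gamete the tuple
    (P(g|M), P(g|B), [P(g|F_1), ..., P(g|F_r)]).
    """
    total = 0.0
    for distances, gametes in sibships:
        lambdas = mating_success(distances, beta)
        for p_mother, p_background, p_fathers in gametes:
            p = gamete_probability(s, m, p_mother, p_background, p_fathers, lambdas)
            total += math.log(p)
    return total


def effective_pollen_parents(phi_ft):
    """Equation 4: N_ep is approximately 1 / (2 * Phi_FT)."""
    if phi_ft <= 0:
        # Assumption of this sketch: refuse values the formula cannot interpret.
        raise ValueError("Phi_FT must be positive to convert to N_ep")
    return 1.0 / (2.0 * phi_ft)


# Toy example: one mother with three local fathers at 1, 2 and 4 units.
lambdas = mating_success([1.0, 2.0, 4.0], beta=-0.5)
sibships = [([1.0, 2.0, 4.0], [(0.5, 0.01, [0.5, 0.0, 0.25])])]
print(lambdas)
print(log_likelihood(0.1, 0.2, -0.5, sibships))
print(effective_pollen_parents(0.005))
```

In practice the log-likelihood would be handed to a numerical optimizer over ($s$, $m$, $\beta$); that step is omitted here.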

    The function $f(x)$ is the probability density distribution that describes the probability that a successful pollen gamete came from an individual located at distance $x$. To use this function for generating data, we translated it into a cumulative distribution function $C(u)$ that describes the probability that pollen came from males located at distances from 0 to $u$ (i.e. what proportion of the pollen pool came from males located at distances from 0 to $u$):

    $$C(u) = \int_0^u P(x)\,dx, \qquad \text{(eqn 6)}$$

    where $P(x)$ is the probability density function that pollen came from any father located at distance $x$:

    $$P(x) = \frac{x f(x)}{\int_0^\infty x f(x)\,dx} = \frac{bx \exp(-bx)}{\int_0^\infty bx \exp(-bx)\,dx}. \qquad \text{(eqn 7)}$$

    Then, since

    $$\int_0^\infty bx \exp(-bx)\,dx = \left[\left(-x - \frac{1}{b}\right) \exp(-bx)\right]_0^\infty = \frac{1}{b}, \qquad \text{(eqn 8)}$$

    therefore

    $$P(x) = b^2 x \exp(-bx), \qquad \text{(eqn 9)}$$

    and finally

    $$C(u) = \int_0^u P(x)\,dx = \int_0^u b^2 x \exp(-bx)\,dx = b\left(-u - \frac{1}{b}\right) \exp(-bu) + 1. \qquad \text{(eqn 10)}$$

    For each simulation, the first step was to place the mothers on the grid. The next step was to place the pollen donors on the grid. While setting a paternal tree’s position, a $C(u)$ value was generated by choosing a random number between 0 and 0.999999. Then, the distance $u$ was found numerically using equation (10). Next, an angle was drawn as a random number between 0 and 360. The two values, i.e. the distance from the maternal tree $u$, and the angle, were then used to obtain the Cartesian coordinates of the pollen parent, which were rounded to integers to match the 1 × 1 unit grid. Therefore, the pollen dispersal distance was not limited by the size of the virtual population (e.g. 100 × 100 individuals as in Smouse et al. 2001; Austerlitz et al. 2004), and the maximum pollen dispersal distance was determined by the cumulative mating success of all fathers. In our simulations, the maximum distance was equal to the radius of a circle delimiting an area from which 99.9999% of the pollen arrived (100% corresponds to infinity), according to the distribution function.

    After assigning the position of every pollen donor, their diploid genotypes were generated as described above. Pollen haplotypes from each father were generated based on Mendelian rules. When a pollen parent was selected more than once, the same diploid genotype was used to generate new pollen haplotypes. Because the species was assumed to be monoecious, a mother was allowed to act as a pollen parent for itself (self-fertilization) and for other mothers. To comply with the methodology of the NEIGHBORHOOD model, the neighbourhood of every mother plant (about 5.66 units in radius) was populated with the remaining adults that were not chosen as pollen donors during the gamete generation step. Therefore, the neighbourhood for each mother consisted of 100 adults as potential fathers, which is consistent with typical NEIGHBORHOOD model analyses. If the wide spacings we simulated were used in real field studies, 2000 adults would need to be genotyped (i.e. 100 adults/neighbourhood × 20 neighbourhoods), plus the offspring. For the TWOGENER model, in contrast, only 20 adults would need to be genotyped. Nevertheless, because the sampling of mothers is usually optimized in paternity-based analyses, and because overlapping neighbourhoods can be used, the number of adult genotypes needed is usually much less than 2000 (Burczyk et al. 1996; Streiff et al. 1999). For each combination of parameters, 200 simulations were performed. The mean and standard deviation of each parameter estimate were then calculated from these 200 replicates.

    Offspring samples were generated under the following scenarios: (I) variable pollen dispersal ($b$ = 0.1–1.0, by 0.1) with no selfing, (II) variable pollen dispersal with random selfing (probability of selfing equal to the probability of dispersal at distance 0), (III) constant pollen dispersal ($b$ = 0.5) but variable selfing ($s$ = 0–0.5, by 0.1). Finally, we introduced long-distance pollen dispersal (LDPD) (scenario IV), which does not follow the exponential distribution and whose origin is unknown. Here, because it is very unlikely to have two gametes originating from the same male, we generated the proportion ($M$) of pollen gametes randomly based on the prior allele frequency distribution, and then the remaining proportion ($1 - M$) as described earlier. $M$ is the proportion of pollen immigration that comes from outside of the local population, whereas $m$ is the immigration that comes from outside of the local neighbourhood. Details on parameter values used in each scenario are given in the Results.

    Parameter estimation

    We wished to estimate mating parameters that could be easily compared between the two methods of gene flow analysis. Therefore, we chose to estimate and compare the effective number of pollen parents and the mean pollen dispersal distance ($N_{eNM}$ and $\delta_{NM}$ for the NEIGHBORHOOD model, and $N_{eTG}$ and $\delta_{TG}$ for TWOGENER) because these estimates should converge given the assumptions we used. In the TWOGENER model, the mean pollen dispersal distance $\delta_{TG}$ was calculated using equation (11), which assumes an exponential distribution and population density ($d$) of 1.0. Equation (11) was derived from the equations presented by Austerlitz & Smouse (2001a).

    $$\delta_{TG} = \frac{1}{2} \sqrt{\frac{1}{\pi \Phi_{FT}}}. \qquad \text{(eqn 11)}$$

    In the NEIGHBORHOOD model, the mean distance of pollen dispersal and effective number of pollen parents are typically estimated using outcross matings alone (Adams & Birkes 1991; Burczyk et al. 1996). To account for the contribution of the paternal trees growing outside the neighbourhood area, the mean distance of pollen dispersal ($\delta_{NM}$) was extrapolated based on the shape of the pollen dispersal function estimated within the neighbourhood (i.e. based on the estimate of $\beta$). Note that $\beta$ is the estimator of $b$ from equation (5) ($\beta = -b$). The parameter $\delta_{NM}$ might be calculated for all matings at distances from 0 to infinity (thus including selfing), as the expected value of $x$:

    $$\delta_{NM} = \int_0^\infty x P(x)\,dx = -\frac{2}{\beta}. \qquad \text{(eqn 12)}$$

    However, for our simulations, we are interested in separating selfing from calculations of mean pollen dispersal. If $t$ is the mean distance that differentiates selfing from outcrossing, then the mean distance of outcrossing pollen can be estimated as:

    $$\delta_{NM} = \int_t^\infty x P(x)\,dx = \frac{-\beta t^2 + 2t - 2\beta^{-1}}{-\beta t + 1}. \qquad \text{(eqn 13)}$$

    The closed form on the right is the mean of $x$ conditional on $x \ge t$, i.e. the integral normalized by $\int_t^\infty P(x)\,dx$. However, the distance $t$ varies depending on the direction of the location of fathers around a mother, and ranges from $t$ = 0.5, for cardinal directions (0, 90, 180, 270 degrees), up to $t$ = 0.71, for noncardinal directions (45, 135, 225, 315). In our simulations, the mean $t$ was close to 0.6. The estimates of mean distances of pollen dispersal were compared to the actual distances of pollen dispersal generated in simulations.

    In both models, the effective number of pollen parents can be estimated as the inverse of the probability of paternal identity within maternal sibships. In the TWOGENER model, the effective number $N_{eTG}$ was calculated based on equation (4). Occasionally, when the actual pollen dispersal is extensive, $\Phi_{FT}$ estimates might be negative, which is generally interpreted as lack of pollen structure (Robledo-Arnuncio et al. 2004). Although we preferred not to restrict our estimates of $\Phi_{FT}$ to positive values, it is unclear how to interpret estimates of $N_e$ based on negative values of $\Phi_{FT}$. Therefore, for the TWOGENER model, the effective population number of pollen parents $N_{eTG}$ was calculated based on the mean value of $\Phi_{FT}$ over the simulations, and the standard deviation of $N_{eTG}$ was calculated based on the standard deviation of $\Phi_{FT}$.

    In the NEIGHBORHOOD model, we estimated the total effective number of pollen parents (including selfing and pollen immigration from outside of the neighbourhood) using the extrapolation proposed by Burczyk et al. (1996):

    $$N_{eNM}(k) = 1 \Big/ \left[ s^2 + (1 - s - m)^2 \sum_j \lambda_j^2 \right], \qquad \text{(eqn 14)}$$

    where $N_{eNM}(k)$ is the effective number of pollen parents around the $k$-th female, $\lambda_j$ is the relative mating success of the $j$-th father in the neighbourhood of that female based on equation (2) given the estimate of $\beta$, and $s$ and $m$ are the population-wide estimates of selfing and pollen immigration from outside the neighbourhood. The $N_{eNM}$ was averaged across all females.

    The $N_{eTG}$ and $N_{eNM}$ estimates were compared to the expectation of $N_{ep}$. For random selfing (simulation scenario II) and a uniform distribution of individuals in space (with density $d$ = 1), the probability of paternal identity in a single maternal sibship is given by:

    $$\frac{1}{N_{ep}} = \int_0^\infty \frac{1}{2\pi x} P^2(x)\,dx = \frac{b^2}{8\pi}. \qquad \text{(eqn 15)}$$

    A more general expectation that would take into account variable selfing (simulation scenarios I and III) and variable extents of LDPD (scenario IV) can be obtained as follows:

    $$\frac{1}{N_{ep}} = \left[ \int_t^\infty \frac{1}{2\pi x} P^2(x)\,dx \right] \cdot (1 - s - M)^2 + s^2 = \frac{b^2 (2bt + 1)}{8\pi \exp(2bt)} \cdot (1 - s - M)^2 + s^2, \qquad \text{(eqn 16)}$$

    where $t$ is the mean distance that differentiates selfing from outcrossing explained above.

    The parameters of the NEIGHBORHOOD model were estimated based on maximum-likelihood methods using the Newton–Raphson algorithm, assuming that the allele frequencies in the background pollen pool are identical to the prior allele frequencies used for generating the data. The two sets of parameter estimates were compared between the two methods and with the expected values. All simulations were performed using a specific computer program, n2g, written by T. E. K. in Object Pascal (Delphi 7, Borland Software), which is available from the authors upon request.

    Results

    Impact of the dispersal parameter

    We conducted our first analyses assuming that the species was self-incompatible (scenario I). The dispersal parameter


    Table 1 The impact of dispersal parameter b on actual mean pollen dispersal distance X, expected effective number of pollen parents N_ep and mating parameters obtained through the NEIGHBORHOOD and TWOGENER models, when self-fertilization is not allowed

                                 NEIGHBORHOOD                                                      TWOGENER
    b    N_ep     X              m              β               N_eNM              δ_NM           Φ_FT               N_eTG              δ_TG
    0.1  2530.10  20.05 (0.44)   0.890 (0.011)  −0.123 (0.062)  8242.66 (1727.16)  16.35 (16.45)  0.00020 (0.00098)  2493.33 (2602.00)  17.15 (17.49)
    0.2   644.15  10.06 (0.22)   0.689 (0.015)  −0.205 (0.040)   962.36 (94.07)     9.85 (2.40)   0.00088 (0.00104)   567.21 (2019.14)  12.83 (16.96)
    0.3   294.31   6.75 (0.13)   0.499 (0.015)  −0.302 (0.033)   337.45 (23.15)     6.72 (0.81)   0.00157 (0.00103)   319.31 (369.45)    9.16 (8.48)
    0.4   171.52   5.11 (0.11)   0.347 (0.015)  −0.399 (0.029)   175.82 (11.24)     5.13 (0.39)   0.00290 (0.00116)   172.32 (82.23)     6.15 (4.92)
    0.5   114.49   4.15 (0.08)   0.234 (0.013)  −0.498 (0.026)   110.29 (5.75)      4.15 (0.22)   0.00467 (0.00122)   106.39 (29.49)     4.23 (0.59)
    0.6    83.39   3.51 (0.08)   0.155 (0.012)  −0.591 (0.026)    78.12 (4.04)      3.54 (0.15)   0.00628 (0.00139)    79.68 (18.62)     3.64 (0.46)
    0.7    64.57   3.05 (0.07)   0.101 (0.010)  −0.693 (0.026)    58.17 (2.67)      3.06 (0.11)   0.00843 (0.00148)    59.30 (10.73)     3.11 (0.28)
    0.8    52.33   2.72 (0.06)   0.065 (0.008)  −0.793 (0.026)    45.61 (2.10)

    2.71

    (0.0

    8)0.

    0108

    6 (0

    .001

    64)

    46.0

    3 (7

    .11)

    2.73

    (0.2

    1)0.

    943

    .93

    2.46

    (0.0

    6)0.

    042

    (0.0

    07)

    −0.8

    91 (0

    .032

    )37

    .25

    (2.0

    1)2.

    45 (0

    .08)

    0.01

    345

    (0.0

    0188

    )37

    .16

    (5.2

    8)2.

    45 (0

    .17)

    1.0

    37.9

    32.

    26 (0

    .04)

    0.02

    6 (0

    .005

    )−0

    .986

    (0.0

    28)

    31.2

    6 (1

    .30)

    2.25

    (0.0

    6)0.

    0157

    0 (0

    .001

    84)

    31.8

    4 (3

    .79)

    2.26

    (0.1

    4)

    NeN

    M a

    nd N

    eTG

    are

    the

    esti

    mat

    es o

    f eff

    ecti

    ve n

    umbe

    r of p

    olle

    n pa

    rent

    s, δ

    NM

    and

    δT

    G a

    re th

    e es

    tim

    ates

    of m

    ean

    polle

    n di

    sper

    sal o

    btai

    ned

    thro

    ugh

    the

    nei

    ghbo

    rho

    od

    and

    tw

    oge

    ner

    mod

    els,

    re

    spec

    tive

    ly; m

    , pro

    port

    ion

    of p

    olle

    n im

    mig

    rati

    ng fr

    om o

    utsi

    de o

    f the

    nei

    ghbo

    urho

    ods;

    β, t

    he e

    stim

    ate

    of d

    ispe

    rsal

    par

    amet

    er; Φ

    FT, i

    ntra

    clas

    s co

    rrel

    atio

    n of

    mal

    e ga

    met

    es w

    ithi

    n fe

    mal

    es;

    stan

    dard

    dev

    iati

    ons

    over

    rep

    licat

    es in

    par

    enth

    eses

    .

    (b) used in our simulations ranged from 0.1 to 1, whichcovers extensive to very restricted pollen movement. Forexample, when b = 0.1, the actual mean dispersal distance(X) is about 20 units, and nearly 90% of pollen gametes(m = 0.89) come from males located outside of thedesignated neighbourhood (i.e. further than 5.66 units). Incontrast, when b = 1.0, X = 2.26 and m = 0.026 (Table 1). Themean distance between mothers (¥ ≈ 117 units) was greaterthan 5X (= 100.25 units) for the most extensive pollendispersal simulated (b = 0.1). This allowed us to calculateNeTG directly from ΦFT using equation (4) (Austerlitz &Smouse 2001a). The estimates of β from the neighborhoodmodel were very close to b values we simulated (notsignificantly different). The variance of < increases as bdecreases because there a lower proportion of the localmatings occurs within the neighbourhood (1 – m), and thisis the basis for estimating β.

    The mean dispersal distance estimates (δNM and δTG) obtained through the neighborhood model and twogener were very similar to each other and were close to the actual dispersal distances, X (Table 1). The estimates converged more closely on the actual values as b increased. However, whereas the estimates of δTG approached X only when b ≥ 0.5, the estimates of δNM were nearly identical to X even for b ≥ 0.2 (Table 1). Notably, the variances of the estimates increased as b decreased, and they were larger for twogener than for the neighborhood model.
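    The link between the estimated β and the neighborhood-model distance δNM can be illustrated with a short calculation. The sketch below assumes that dispersal distances x ≥ t follow a density proportional to x·exp(βx) with β < 0 (a two-dimensional exponential kernel), so that the mean distance is (β²t² − 2βt + 2)/(β(βt − 1)), which reduces to −2/β when t = 0. The function name is hypothetical, and the minimum distance t = 0.6 is an assumption chosen only because it reproduces the tabulated values; it is not stated in this section.

```python
def mean_distance_from_beta(beta: float, t: float = 0.0) -> float:
    """Mean dispersal distance implied by an exponential kernel.

    Assumes distances x >= t have density proportional to x * exp(beta * x),
    with beta < 0. For t = 0 the result is -2 / beta.
    """
    if beta >= 0:
        raise ValueError("beta must be negative for a decaying kernel")
    if t < 0:
        raise ValueError("t must be non-negative")
    numerator = beta ** 2 * t ** 2 - 2.0 * beta * t + 2.0
    denominator = beta * (beta * t - 1.0)
    return numerator / denominator


# Mean beta estimates from Table 1 (no selfing); t = 0.6 is an assumed value.
for beta in (-0.498, -0.986):
    print(beta, round(mean_distance_from_beta(beta, t=0.6), 2))
```

    With these inputs the computed distances agree with the tabulated δNM values for b = 0.5 and b = 1.0 to two decimals, which is the only sense in which the assumed t is supported.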

    Our estimates of the effective number of pollen parents (NeNM and NeTG) ranged from about 31, when b = 1.0, to several thousand when b = 0.1. The values of NeTG were similar to expectations (Nep), whereas NeNM was significantly overestimated when b ≤ 0.2. For b > 0.2, the values of NeNM and NeTG were very close to one another and to the expected values (Nep). Nonetheless, the variances of the estimates were larger for twogener than for the neighborhood model. In addition, the variances of ΦFT were low and relatively stable across a range of b values. Given the genetic information content of our simulated data sets (i.e. 1000 offspring, EP = 0.9996), the mating parameters are reasonably well estimated (i.e. low coefficients of variation) in the neighborhood model when b ≥ 0.2, and in twogener when b ≥ 0.5.
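    The twogener point estimates follow directly from ΦFT. The sketch below uses Ne = 1/(2ΦFT), as stated in the text, together with δ = ½·√(1/(πΦFT)) for the mean dispersal distance. The latter form assumes an exponential dispersal function and a unit density of pollen donors, and the function name is hypothetical.

```python
import math


def twogener_estimates(phi_ft: float) -> tuple[float, float]:
    """Return (Ne, delta) implied by an intraclass correlation phi_ft.

    Ne    = 1 / (2 * phi_ft)
    delta = 0.5 * sqrt(1 / (pi * phi_ft))   # assumes unit donor density
    """
    if phi_ft <= 0:
        raise ValueError("phi_ft must be positive")
    n_e = 1.0 / (2.0 * phi_ft)
    delta = 0.5 * math.sqrt(1.0 / (math.pi * phi_ft))
    return n_e, delta


# Mean PhiFT for b = 1.0 in Table 1.
n_e, delta = twogener_estimates(0.01570)
print(round(n_e, 2), round(delta, 2))
```

    Because both quantities are nonlinear in ΦFT, applying the functions to a mean ΦFT will not in general equal the mean of replicate-level estimates, which is one reason the tabulated NeTG and δTG differ slightly from such a calculation when the variance of ΦFT is large.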

    For any given b, our estimates of ΦFT (Table 1) are much lower than those reported by Smouse et al. (2001) (Fig. 3, therein). We suppose that these authors simulated their data sets in only one dimension using equation (5), which does not take into account that (at any given distance) there is a certain number of individuals with the same mating probability. In later publications, Smouse and coworkers employed two-dimensional pollen distributions (Austerlitz & Smouse 2001a) that are comparable to our simulations.

    We repeated our simulations allowing for self-fertilization, the frequency of which equals the probability of pollen dispersal at distance 0 (Table 2) (scenario II). Our simulated rates of self-fertilization ranged from 0.002 (at b = 0.1) to 0.109 (for b = 1.0). Therefore, the actual mean dispersal distance (which accounts for self-fertilization occurring at distance 0) decreased faster with increasing b. Our estimates of the effective number of pollen parents and mean dispersal distance also decreased, and converged on the expected values as b increased. NeNM was significantly overestimated when b ≤ 0.2 (Table 2). However, the accuracy of δ estimates was greater, and the variances of Ne and δ were lower, for the neighborhood model.

    Table 2 The impact of dispersal parameter b on actual mean dispersal distance X, expected effective number of pollen parents Nep and mating parameters obtained through the neighborhood model (s, m, β, NeNM, δNM) and twogener (ΦFT, NeTG, δTG), when self-fertilization is allowed

    b    Nep      X             s              m              β               NeNM               δNM            ΦFT              NeTG               δTG
    0.1  2513.27  20.02 (0.48)  0.002 (0.001)  0.889 (0.011)  −0.114 (0.062)  8044.37 (1823.91)  27.36 (28.76)  0.0002 (0.0010)  2340.46 (2445.36)  15.93 (17.70)
    0.2  628.32   9.99 (0.21)   0.007 (0.002)  0.685 (0.016)  −0.203 (0.042)  935.89 (96.62)     10.30 (2.28)   0.0006 (0.0010)  789.91 (1364.48)   14.21 (16.35)
    0.3  279.25   6.68 (0.15)   0.013 (0.003)  0.494 (0.017)  −0.302 (0.033)  329.33 (24.62)     6.71 (0.72)    0.0018 (0.0011)  280.47 (285.16)    8.45 (6.17)
    0.4  157.08   5.02 (0.12)   0.022 (0.004)  0.341 (0.017)  −0.398 (0.030)  169.29 (11.06)     5.06 (0.38)    0.0030 (0.0012)  168.74 (82.73)     5.67 (1.78)
    0.5  100.53   4.01 (0.07)   0.033 (0.005)  0.226 (0.013)  −0.496 (0.025)  104.44 (5.66)      4.04 (0.19)    0.0049 (0.0013)  102.77 (30.16)     4.18 (0.74)
    0.6  69.81    3.34 (0.07)   0.046 (0.006)  0.147 (0.011)  −0.595 (0.026)  71.99 (3.80)       3.36 (0.14)    0.0069 (0.0014)  72.15 (15.37)      3.45 (0.38)
    0.7  51.29    2.87 (0.06)   0.059 (0.007)  0.095 (0.010)  −0.694 (0.028)  53.24 (2.71)       2.88 (0.10)    0.0093 (0.0016)  53.61 (9.37)       2.95 (0.27)
    0.8  39.27    2.51 (0.05)   0.076 (0.008)  0.060 (0.007)  −0.796 (0.029)  40.70 (1.96)       2.51 (0.08)    0.0123 (0.0017)  40.79 (5.72)       2.57 (0.18)
    0.9  31.03    2.23 (0.05)   0.093 (0.008)  0.038 (0.006)  −0.895 (0.028)  32.31 (1.58)       2.22 (0.06)    0.0151 (0.0019)  33.05 (4.22)       2.31 (0.15)
    1.0  25.13    2.01 (0.04)   0.109 (0.010)  0.024 (0.005)  −0.991 (0.030)  26.67 (1.35)       2.00 (0.05)    0.0185 (0.0020)  27.10 (2.96)       2.09 (0.11)

    Parameter descriptions as in Table 1; s, the estimate of self-fertilization; standard deviations over replicates in parentheses.
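    The expected values of Nep in Table 2 equal 8π/b² (for example, 8π/0.5² ≈ 100.53). A short derivation shows where this comes from, under the assumption (made here for illustration) that the two-dimensional pollen dispersal density at distance x is P(x) = (b²/2π)·exp(−bx) and that the inverse of the effective number is the probability that two pollen gametes share a father:

```latex
\begin{align*}
2\pi\int_{0}^{\infty} x\,P(x)\,dx
  &= b^{2}\int_{0}^{\infty} x\,e^{-bx}\,dx = 1
  && \text{(the density integrates to one)} \\
\frac{1}{N_{ep}}
  &= 2\pi\int_{0}^{\infty} x\,P(x)^{2}\,dx
   = \frac{b^{4}}{2\pi}\int_{0}^{\infty} x\,e^{-2bx}\,dx
   = \frac{b^{4}}{2\pi}\cdot\frac{1}{4b^{2}}
   = \frac{b^{2}}{8\pi}
\end{align*}
```

    Hence Nep = 8π/b², which gives 2513.27 for b = 0.1 and 25.13 for b = 1.0, in agreement with the first and last rows of the table.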

    Impact of self-fertilization

    To investigate the effect of self-fertilization upon the parameter estimates, we simulated data sets with b set to 0.5, and allowed s to vary from 0 to 0.5 at an interval of 0.1 (scenario III). These analyses show that ΦFT increases dramatically as s increases (Table 3). The difference in ΦFT between s = 0 and s = 0.5 was 25-fold. This resulted in a corresponding decrease in the values of NeTG. Nevertheless, the estimates of the effective number of pollen parents are close to the expected values and similar between the neighborhood model and twogener, although twogener produced higher variances (Table 3). Although the neighborhood model provided unbiased estimates of δNM, twogener gave estimates of δTG that were biased downwards. This is not surprising because it is inappropriate to use equation (11) when the levels of simulated self-fertilization exceed the level of selfing expected from the dispersal curve at b = 0.5.

    Because both s and b have a strong influence on ΦFT, we further investigated their joint impact on the estimates of ΦFT. We generated data samples for a combination of b (ranging from 0.1 to 1.0 at an interval of 0.1) and s (from 0 to 0.5 at an interval of 0.05). We found that self-fertilization has a much stronger effect upon ΦFT than does the mode of pollen dispersal. ΦFT increased nearly exponentially as s increased (Fig. 1a). The proportional impact of selfing seems to be greater when b is small. Interestingly, the same values of ΦFT may be produced from very different mating patterns. For example, ΦFT was about 0.0046 when b equalled 0.5 and s equalled 0 (Table 1), but also when b equalled 0.1 and s equalled 0.10. Nonetheless, the estimates of pollen dispersal under these scenarios were quite different (4.15 and 17.98, respectively).

    Table 3 The impact of self-fertilization (S) on actual mean dispersal distance X, expected effective number of pollen parents Nep and mating parameters obtained through the neighborhood model (s, m, β, NeNM, δNM) and twogener (ΦFT, NeTG, δTG), when dispersal parameter b = 0.5

    S     Nep     X            s                m              β               NeNM           δNM          ΦFT              NeTG            δTG
    0.00  114.49  4.15 (0.08)  -                0.234 (0.013)  −0.498 (0.026)  110.29 (5.75)  4.15 (0.22)  0.0047 (0.0012)  106.39 (29.49)  4.23 (0.59)
    0.10  58.56   3.73 (0.08)  0.1000 (0.0005)  0.211 (0.013)  −0.498 (0.032)  57.63 (1.50)   3.75 (0.25)  0.0079 (0.0014)  63.32 (11.33)   3.21 (0.29)
    0.20  21.93   3.33 (0.08)  0.2000 (0.0005)  0.188 (0.011)  −0.496 (0.032)  21.84 (0.20)   3.35 (0.21)  0.0212 (0.0018)  23.57 (1.98)    1.94 (0.08)
    0.30  10.61   2.90 (0.08)  0.2999 (0.0004)  0.164 (0.012)  −0.502 (0.032)  10.58 (0.04)   2.90 (0.17)  0.0452 (0.0028)  11.07 (0.68)    1.33 (0.04)
    0.40  6.13    2.48 (0.07)  0.4000 (0.0004)  0.139 (0.010)  −0.493 (0.034)  6.12 (0.02)    2.53 (0.16)  0.0792 (0.0038)  6.31 (0.30)     1.00 (0.02)
    0.50  3.96    2.08 (0.06)  0.4999 (0.0003)  0.117 (0.009)  −0.495 (0.038)  3.96 (0.01)    2.10 (0.15)  0.1238 (0.0050)  4.04 (0.16)     0.80 (0.02)

    Parameter descriptions as in Tables 1 and 2; standard deviations over replicates in parentheses.

    Impact of long-distance pollen dispersal

    In other simulations, we permitted long-distance pollen dispersal (LDPD) that does not follow the exponential distribution and whose origin is unknown (scenario IV). Assuming no selfing, we generated data samples for a combination of b (ranging from 0.1 to 1.0 at an interval of 0.1) and M (from 0 to 0.5, at an interval of 0.05) (Fig. 2a). The impact of M upon ΦFT is evident: increasing M lowers ΦFT. Therefore, if LDPD occurs, this will result in lower estimates of ΦFT and increased estimates of Ne. In any case, if LDPD occurs, the mean dispersal distance cannot be estimated reliably using either twogener or the neighborhood model.

    We also generated data samples for a combination of s and M (both ranging from 0 to 0.45 at an interval of 0.05) with a constant b (= 0.1; Fig. 2b). Under these conditions, LDPD has a minor effect on ΦFT compared to the effect of selfing. Nevertheless, when we included either selfing or LDPD in our simulations, the neighborhood model and twogener both provided estimates of the effective number of pollen parents that were not significantly different from the expected values (Nep) (Tables 3 and 4).
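    The expected values (Nep) against which the estimates are compared can be reproduced with a short calculation. The sketch below assumes that the inverse of Nep is the local-mating term b²(1 + 2bt)/(8π·exp(2bt)) scaled by (1 − M − s)², plus s² for selfing. The function and argument names are hypothetical, and the minimum mating distance t = 0.6 is an assumption chosen because it reproduces the tabulated expectations for b = 0.5; it is not a value given in this section.

```python
import math


def expected_nep(b: float, s: float = 0.0, ldpd: float = 0.0, t: float = 0.6) -> float:
    """Expected effective number of pollen parents.

    b    : dispersal parameter of the exponential kernel
    s    : proportion of self-fertilization
    ldpd : proportion of long-distance pollen (M in the text), assumed to
           come from an effectively infinite pool of fathers
    t    : minimum mating distance (assumed value; use 0 when selfing is
           part of the dispersal curve itself)
    """
    local = b ** 2 * (1.0 + 2.0 * b * t) / (8.0 * math.pi * math.exp(2.0 * b * t))
    inverse_nep = local * (1.0 - ldpd - s) ** 2 + s ** 2
    return 1.0 / inverse_nep


# Combinations of selfing (s) and long-distance dispersal (M) at b = 0.5.
for s, m in [(0.0, 0.0), (0.1, 0.1), (0.1, 0.3), (0.3, 0.1), (0.3, 0.3)]:
    print(s, m, round(expected_nep(0.5, s=s, ldpd=m), 2))
```

    The s² term dominates quickly: at s = 0.3 the expectation is close to 11 whether M is 0.1 or 0.3, which is consistent with the observation that LDPD has only a minor effect compared with selfing.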

    Discussion

    Previous applications of twogener focused on the effective number of pollen parents and resulting effective pollen dispersal distance (Smouse et al. 2001; Sork et al. 2002a; Austerlitz et al. 2004). In contrast, paternity-based analyses emphasized the contribution of long-distance pollen flow and mean pollen dispersal distance (Burczyk et al. 1996; Chase et al. 1996; Streiff et al. 1999; Lian et al. 2001). Whereas the first perspective is important for evolutionary geneticists, the second is more familiar to plant biologists, conservationists, and plant breeders.

    Despite their differences, the neighborhood and twogener models provide comparable estimates of the effective number of pollen parents. Furthermore, they may provide comparable estimates of mean pollen dispersal distance, provided the assumptions about the mode of pollen dispersal are appropriate. The neighborhood model provided lower variances of the parameter estimates, but more potential pollen parents would need to be genotyped (greater genetic information content). Under such sampling, maximum-likelihood methods should always have smaller variances for a given parameter combination.

    Estimates of NeNM are biased upwards when pollen dispersal is extensive (b ≤ 0.2, Tables 1 and 2) because pollen contributions from outside of the neighbourhoods are ignored in equation (14). Equation (14) is based on the assumption that background pollination occurs randomly with an infinite number of potential pollen parents. Therefore, the values derived from equation (14) might be considered an upper bound of the effective number of pollen parents (Burczyk et al. 1996). Nevertheless, because equation (14) appears to be accurate for estimating the effective numbers reduced due to selfing or restricted pollen dispersal, it should be appropriate when precise estimates are needed (e.g. in genetic conservation programs).


    Impact of self-fertilization

    Self-fertilization might affect the estimates of ΦFT (Austerlitz & Smouse 2001a, b) because selfing strongly affects the probability of identity by descent (IBD). The need to integrate selfing into the estimation of ΦFT has already been emphasized (Austerlitz et al. 2004; Smouse & Sork 2004). In practice, self-fertilization might be only weakly related to the probability of pollination at distance 0. Selfing in different taxa depends on a number of ecological and genetic determinants, including population density, the availability of self-pollen, synchronization between male and female flowering, and self-fertility (Schnabel 1998; Boshier 2000). Although selfing might be reduced through post-zygotic selection, several authors have demonstrated that selfing varies among individuals within populations. Therefore, we further explored the impact of selfing level on the parameter estimates produced by both models.

    Fig. 1 Impact of dispersal parameter and self-fertilization upon the total intraclass correlation, ΦFT (a), and the intraclass correlation resulting from outcrossing, Φ′FT (b).

    Fig. 2 The impact of long-distance pollen dispersal and dispersal parameter (a) and selfing (b) upon the total intraclass correlation, ΦFT. Note the scale difference between the two graphs.


    Table 4 The impact of selfing (S) and long-distance pollen dispersal (M) on expected effective number of pollen parents Nep and the estimates of mating parameters obtained through the neighborhood model (m, β, NeNM) and twogener (ΦFT, NeTG), when dispersal parameter b = 0.5

    S     M     Nep     m              β               NeNM           ΦFT              NeTG
    0.00  0.00  114.49  0.234 (0.013)  −0.498 (0.026)  110.29 (5.75)  0.0047 (0.0012)  106.39 (29.49)
    0.10  0.10  64.14   0.289 (0.013)  −0.494 (0.029)  63.43 (1.48)   0.0071 (0.0014)  70.45 (13.97)
    0.10  0.30  76.08   0.440 (0.011)  −0.501 (0.035)  75.21 (1.45)   0.0057 (0.0012)  87.54 (19.30)
    0.30  0.10  10.74   0.240 (0.010)  −0.498 (0.035)  10.72 (0.04)   0.0445 (0.0026)  11.23 (0.66)
    0.30  0.30  10.94   0.394 (0.009)  −0.497 (0.036)  10.93 (0.03)   0.0436 (0.0026)  11.46 (0.70)

    Parameter descriptions as in Table 1; standard deviations over replicates in parentheses.

    It would be interesting to determine the fraction of ΦFT that results from outcross matings. The data sets used for the twogener method can be readily used to estimate selfing (Ritland 2002). These estimates might then be used to adjust the estimates of ΦFT so they reflect outcross matings only. We may use the reasoning outlined by Burczyk et al. (1996). The effective number of pollen parents can be estimated according to Crow & Kimura (1970):

    $$N_e = \frac{1}{\sum_{j=1}^{J}\phi_j^{2}} \qquad \text{(eqn 17)}$$

    where φj is the contribution of the j-th pollen parent to the offspring. Therefore φj² is the probability that two gametes come from the j-th father, and Σφj = 1. Let one of the φj denote the proportion of selfing (s), and the remaining φj parameters denote outcrossing: φj = (1 − s)φ′j, where φ′j is an adjusted outcrossing contribution, and these contributions still sum to unity (Σφ′j = 1). Separating selfing from outcrossing, the denominator of equation (17) is

    $$\sum_{j=1}^{J}\phi_j^{2} = s^{2} + \sum_{j=1}^{J-1}\left[(1-s)\,\phi'_j\right]^{2} = s^{2} + (1-s)^{2}\sum_{j=1}^{J-1}\phi_j'^{\,2} \qquad \text{(eqn 18)}$$

    Since for twogener Ne = 1/(2ΦFT), we may relate the two denominators:

    $$2\Phi_{FT} = s^{2} + (1-s)^{2}\sum_{j=1}^{J-1}\phi_j'^{\,2} \qquad \text{(eqn 19)}$$

    The last component on the far right-hand side (Σφ′j², summed over the J − 1 outcross fathers) might be used in equation (17) to estimate the Ne that results only from outcrossing. Thus, by analogy, 2Φ′FT = Σφ′j², where Φ′FT is the intraclass correlation of outcross matings. Substituting it into equation (19), we may calculate Φ′FT based on ΦFT and selfing:

    $$\Phi'_{FT} = \frac{2\Phi_{FT} - s^{2}}{2(1-s)^{2}} \qquad \text{(eqn 20)}$$

    which might then be used to approximate the mean pollen dispersal for outcross mating, using various types of dispersal functions (Austerlitz & Smouse 2001a; Austerlitz et al. 2004). This parameter is comparable to the correlation of paternity, as presented by Ritland (1989).

    We used equation (20) to estimate Φ′FT for the simulated data sets shown in Fig. 1a. The relationship between Φ′FT and b appeared to be uniform across a range of selfing levels and, in fact, always approximated the relationship observed when s = 0 (Fig. 1b). Although the variance of Φ′FT is slightly larger with higher s values, this is primarily caused by the reduced proportion of outcrossed offspring (1 – s) that is used for estimating ΦFT.

    Self-fertilization rates observed in the seed or seedling stage might be reduced at later life stages (Williams & Savolainen 1996). In these cases, the estimates of pollen dispersal based on seed samples might underestimate the effective pollen dispersal that would reflect the population structure at later life stages. For these reasons, it is valuable to estimate pollen dispersal distance based on outcross matings only. Although this is an inherent feature of the neighborhood model, it can also be done using twogener.

    Robledo-Arnuncio et al. (2004) found s = 0.068 and ΦFT = 0.004 (Ne = 125) in one of the Scots pine populations they studied. Based on equation (20), we can estimate Φ′FT = 0.00194, which gives an effective number of outcrossing pollen parents of N′e = 257, double the estimate of Ne. A contrasting example is that of Sork et al. (2002a), who investigated pollen movement in a population of Quercus lobata. They estimated ΦFT = 0.136, which translates to Ne = 3.68 individuals. The selfing level observed in the seed sample was low, s = 0.04 (Sork et al. 2002b). Using equation (20), we can calculate the Φ′FT that results solely from outcrossing. In this case, Φ′FT is higher than ΦFT, so that N′e is even lower than Ne (Φ′FT = 0.1467, N′e = 3.41). This is not surprising because we expect an equilibrium stage to exist:

    $$\Delta\Phi_{FT} = \Phi_{FT} - \Phi'_{FT} = 0 \qquad \text{(eqn 21)}$$

    Substituting Φ′FT with equation (20), we see that the equilibrium is attained when

    $$2\Phi_{FT} = \frac{s}{2 - s} \qquad \text{(eqn 22)}$$

    The far right-hand side of the equation equals the inbreeding coefficient under equilibrium and partial selfing (Hedrick 1999; p. 190). ∆ΦFT will be greater than zero (and N′e greater than Ne) when 2ΦFT < s/(2 – s), as in the example of Robledo-Arnuncio et al. (2004). Otherwise, when 2ΦFT > s/(2 – s), ∆ΦFT will be negative, leading to a decrease in N′e (as in Sork et al. 2002a, b). The impact of ΦFT and s upon ∆ΦFT is demonstrated in Fig. 3.
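    The outcross adjustment and the equilibrium condition can be checked numerically against the two empirical examples. The sketch below implements Φ′FT = (2ΦFT − s²)/(2(1 − s)²) and the threshold 2ΦFT = s/(2 − s); the function names are hypothetical and the inputs are the published values quoted in the text.

```python
def outcross_phi(phi_ft: float, s: float) -> float:
    """Intraclass correlation attributable to outcross matings only."""
    if not 0.0 <= s < 1.0:
        raise ValueError("s must lie in [0, 1)")
    return (2.0 * phi_ft - s ** 2) / (2.0 * (1.0 - s) ** 2)


def effective_number(phi: float) -> float:
    """Effective number of pollen parents, Ne = 1 / (2 * phi)."""
    return 1.0 / (2.0 * phi)


def delta_phi_is_positive(phi_ft: float, s: float) -> bool:
    """True when PhiFT exceeds the outcross value, i.e. 2*PhiFT < s / (2 - s)."""
    return 2.0 * phi_ft < s / (2.0 - s)


examples = {
    "Scots pine (Robledo-Arnuncio et al. 2004)": (0.004, 0.068),
    "Quercus lobata (Sork et al. 2002a, b)": (0.136, 0.04),
}
for label, (phi_ft, s) in examples.items():
    phi_out = outcross_phi(phi_ft, s)
    print(label, round(phi_out, 5), round(effective_number(phi_out), 2),
          delta_phi_is_positive(phi_ft, s))
```

    In the Scots pine case the selfing term accounts for more than half of 2ΦFT, so removing it roughly doubles the effective number; in the Quercus lobata case selfing is too low relative to ΦFT for the adjustment to raise it.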

    Impact of long-distance pollen dispersal

    Several empirical studies demonstrated that the shape of the exponential pollen dispersal curve in local populations cannot explain the high levels of pollen immigration from outside (Burczyk et al. 1996; Dow & Ashley 1998; Streiff et al. 1999; Oddou-Muratorio et al. 2003). Pollen dispersal by wind is a complex phenomenon (Di-Giovanni & Kevan 1991). Although some proportion of the pollen dispersed within populations might be described by mathematically simple dispersal curves, some pollen might be dispersed over large distances at random. In large continuous populations of wind-pollinated species (such as temperate forest trees), pollen movement is partly driven by turbulence that lifts pollen over the canopy of a population, where mild winds are then able to carry pollen over large distances (Di-Giovanni & Kevan 1991; Lindgren et al. 1995). Examples of long and intensive pollen dispersal are also known among animal-pollinated trees (Chase et al. 1996; Nason & Hamrick 1997; Konuma et al. 2000).

    Let us assume that a considerable proportion of the pollen comes from a very restricted number of fathers, and the rest of the pollen comes from a potentially infinite pool of fathers. Based on equation (17), this situation will lead to relatively low effective numbers of fathers anyway. For example, let five fathers contribute to the next generation equally (i.e. φj = 0.1 for each pollen parent), and let the rest of the pollen (50%) come from an infinite number of fathers. Based on equation (17), the resulting Ne cannot be greater than 20. If 50% of the pollen results from LDPD, then how informative is Ne or ΦFT for estimating the extent of pollen movement?
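    The bound of 20 in this example follows from the inverse sum of squared paternal contributions. The sketch below is illustrative only: the function name is hypothetical, and the infinite background pool is approximated by splitting the remaining 50% of pollen equally among a large, arbitrary number of fathers.

```python
def effective_number_of_fathers(contributions: list[float]) -> float:
    """Effective number of pollen parents, 1 / sum(phi_j ** 2)."""
    if any(c < 0 for c in contributions):
        raise ValueError("contributions must be non-negative")
    return 1.0 / sum(c * c for c in contributions)


local_fathers = [0.1] * 5  # five fathers, each siring 10% of the offspring

# Treating the other 50% as an infinite pool adds nothing to the sum of squares.
print(effective_number_of_fathers(local_fathers))

# A finite background pool of increasing size approaches the same bound from below.
for n_background in (10, 100, 10_000):
    background = [0.5 / n_background] * n_background
    print(n_background, round(effective_number_of_fathers(local_fathers + background), 2))
```

    However many distant fathers share the remaining half of the pollen, the five local fathers alone cap the effective number at 20, which is why Ne by itself says little about the extent of long-distance pollen movement.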

    Both Ne and ΦFT are composite summary statistics, which can be the same for very different mating scenarios. Looking at estimates of ΦFT, we do not know the reason for high or low intraclass correlation. It could be selfing, correlated paternity, or LDPD. We can easily imagine that there might be data examples with selfing s = 0.2 and some proportion of pollen coming from a great distance (say M = 0.40). Then ΦFT ≈ 0.02 (Fig. 2b), which is equivalent to Ne = 25. In our virtual experiment, the same values of ΦFT and Ne may be obtained when there is no selfing and no LDPD (s = M = 0) and when b = 1.15, a pattern of very restricted pollen dispersal. Ne is a parameter of general interest and utility for demography and genetics of populations, and it is useful in a number of theoretical and practical considerations. However, the estimates of ΦFT or Ne alone are inadequate to describe the details of mating patterns and gene flow. The difficulties arise from estimating the average dispersal distance from ΦFT, when factors like selfing, LDPD, or effective density impose problems. LDPD will certainly pose a problem for estimating the average distance of pollen dispersal, not only in twogener, but also in the neighborhood model.

    We compared the simplest versions of the two methods under optimum assumptions: categorical assignment, large offspring samples, highly informative markers, lack of genotyping problems, uniform male fertility, and an exponential dispersal function. In the real world, the two methods might behave differently. Genotyping problems might inflate the levels of long-distance pollen dispersal as estimated in the neighborhood model (Burczyk et al. 2004a), but they might have only a minor effect upon ΦFT. Variable male fertility reduces the ‘effective density’ (sensu Austerlitz & Smouse 2002) of pollen donors, which in turn affects the estimates of mean dispersal distance. Variable flower phenology may further increase differentiation of pollen pools among females, leading to decreased estimates of Ne and underestimates of δ. Recent developments of twogener allow one to estimate the shape parameter of the dispersal function jointly with the ‘effective density’ (Austerlitz et al. 2004). In the case of Quercus lobata, this yielded a much larger average pollination distance (δ = 300 m) compared to the case in which the density was estimated from population counts (δ ≈ 100–120 m) (Sork et al. 2002a; Austerlitz et al. 2004). On the other hand, some of the male fertility covariates can be incorporated into the neighborhood model to account for the variable male fertilities (Burczyk et al. 1996, 2002; Burczyk & Prat 1997).

    Fig. 3 The impact of total intraclass correlation, ΦFT, and selfing upon the difference between the total and outcross intraclass correlations, ∆ΦFT (= ΦFT – Φ′FT). The shaded area indicates ∆ΦFT < 0.

    The distribution of pollen dispersal is also generally unknown, but the shape of the pollen dispersal function has to be assumed for either model. As indicated by Austerlitz et al. (2004), testing various dispersal functions is feasible in twogener; however, the mating models have the advantage that the fit of the data can be evaluated from the resulting log-likelihoods, which is not possible in twogener. Some of the assumptions mentioned above, along with ambiguous genetic assignment, could be tested in future using computer simulations.

    Acknowledgements

    We are grateful to Igor Chybicki, Glenn Howe, Peter Smouse and Juan Jose Robledo-Arnuncio for helpful discussions, comments and suggestions that improved the manuscript.

    ReferencesAdams WT (1992) Gene dispersal within forest tree populations.

    New Forests, 6, 217–240.Adams WT, Birkes DS (1991) Estimating mating patterns in forest

    tree populations. In: Biochemical Markers in the PopulationGenetics of Forest Trees (eds Fineschi S, Malvolti ME, Cannata F,Hattemer HH), pp. 157–172. SPB Academic Publishing, TheHague, The Netherlands.

    Adams WT, Burczyk J (2000) Magnitude and implications of gene flow in gene conservation reserves. In: Forest Conservation Genetics: Principles and Practice (eds Young A, Boshier D, Boyle T), pp. 215–224. CSIRO Publishing, Collingwood, Australia.

    Austerlitz F, Smouse PE (2001a) Two-generation analysis of pollen flow across a landscape. II. Relation between ΦFT, pollen dispersal and interfemale distance. Genetics, 157, 851–857.

    Austerlitz F, Smouse PE (2001b) Two-generation analysis of pollen flow across a landscape. III. Impact of adult population structure. Genetical Research, 78, 271–280.

    Austerlitz F, Smouse PE (2002) Two-generation analysis of pollen flow across a landscape. IV. Estimating the dispersal parameter. Genetics, 161, 355–363.

    Austerlitz F, Dick CW, Dutech C et al. (2004) Using genetic markers to estimate the pollen dispersal curve. Molecular Ecology, 13, 937–954.

    Boshier DH (2000) Mating systems. In: Forest Conservation Genetics: Principles and Practice (eds Young A, Boshier D, Boyle T), pp. 63–79. CSIRO Publishing, Collingwood, Australia.

    Burczyk J, Prat D (1997) Male reproductive success in Pseudotsuga menziesii (Mirb.) Franco: the effect of spatial structure and flowering characteristics. Heredity, 79, 638–647.

    Burczyk J, Chybicki IJ (2004) Cautions on direct gene flow estimation in plant populations. Evolution, 58, 956–963.

    Burczyk J, Adams WT, Shimizu JY (1996) Mating patterns and pollen dispersal in a natural knobcone pine (Pinus attenuata Lemmon.) stand. Heredity, 77, 251–260.

    Burczyk J, Adams WT, Moran GF, Griffin AR (2002) Complex patterns of mating revealed in a Eucalyptus regnans seed orchard using allozyme markers and the neighbourhood model. Molecular Ecology, 11, 2379–2391.

    Burczyk J, DiFazio SP, Adams WT (2004a) Gene flow in forest trees: how far do genes really travel? Forest Genetics, 11, 179–192.

    Burczyk J, Lewandowski A, Chalupka W (2004b) Local pollen dispersal and distant gene flow in Norway spruce (Picea abies [L.] Karst.). Forest Ecology and Management, 197, 39–48.

    Chase M, Kesseli R, Bawa K (1996) Microsatellite markers for population and conservation genetics of tropical trees. American Journal of Botany, 83, 51–57.

    Crow JF, Kimura M (1970) An Introduction to Population Genetics Theory. Harper and Row, New York.

    Devlin B, Ellstrand NC (1990) The development and application of a refined method for estimating gene flow from angiosperm paternity analysis. Evolution, 44, 248–259.

    Devlin B, Roeder K, Ellstrand NC (1988) Fractional paternity assignment: theoretical development and comparison to other methods. Theoretical and Applied Genetics, 76, 369–380.

    DiFazio SP, Slavov GT, Burczyk J, Leonardi S, Strauss SH (2004) Gene flow from tree plantations and implications for transgenic risk assessment. In: Plantation Forest Biotechnology for the 21st Century (eds Walter C, Carson M), pp. 405–422. Research Signpost, Kerala, India.

    Di-Giovanni F, Kevan PG (1991) Factors affecting pollen dynamics and its importance to pollen contamination: a review. Canadian Journal of Forest Research, 21, 1155–1170.

    Di-Giovanni F, Kevan PG, Arnold J (1996) Lower planetary boundary layer profiles of atmospheric conifer pollen above a seed orchard in northern Ontario, Canada. Forest Ecology and Management, 83, 87–97.

    Dow BD, Ashley MV (1998) High levels of gene flow in bur oak revealed by paternity analysis using microsatellites. Journal of Heredity, 89, 62–70.

    Dyer RJ, Sork VL (2001) Pollen pool heterogeneity in shortleaf pine, Pinus echinata Mill. Molecular Ecology, 10, 859–866.

    Dyer RJ, Westfall RD, Sork VL, Smouse PE (2004) Two-generation analysis of pollen flow across a landscape V: a stepwise approach for extracting factors contributing to pollen structure. Heredity, 92, 204–211.

    Ellstrand NC (1992) Gene flow by pollen: implications for plant conservation genetics. Oikos, 63, 77–86.

    Excoffier L, Smouse PE, Quattro JM (1992) Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics, 131, 479–491.

    Hamrick JL, Nason JD (2000) Gene flow in forest trees. In: Forest Conservation Genetics: Principles and Practice (eds Young A, Boshier D, Boyle T), pp. 81–90. CSIRO Publishing, Collingwood, Australia.

    Hedrick PW (1999) Genetics of Populations. Jones and Bartlett Publishers, Boston.

    Hjelmroos M (1991) Evidence of long-distance transport of Betula pollen. Grana, 30, 215–228.

    Konuma A, Tsumura Y, Lee CT, Lee SL, Okuda T (2000) Estimation of gene flow in the tropical-rainforest tree Neobalanocarpus heimii (Dipterocarpaceae), inferred from paternity analysis. Molecular Ecology, 9, 1843–1852.

    Latouche-Halle C, Ramboier A, Bandou E, Caron H, Kremer A (2004) Long-distance pollen flow and tolerance to selfing in a Neotropical tree species. Molecular Ecology, 13, 1055–1226.

    Lenormand T (2002) Gene flow and the limits to natural selection. Trends in Ecology & Evolution, 17, 183–189.

    Lian CL, Miwa M, Hogetsu T (2001) Outcrossing and paternity analysis of Pinus densiflora (Japanese red pine) by microsatellite polymorphism. Heredity, 87, 88–98.

    Lindgren D, Paule L, Shen XH et al. (1995) Can viable pollen carry Scots pine genes over long distances? Grana, 34, 64–69.

    Nason JD, Hamrick JL (1997) Reproductive and genetic consequences of forest fragmentation: two case studies of Neotropical canopy trees. Journal of Heredity, 88, 264–276.

    Nason JD, Herre EA, Hamrick JL (1998) The breeding structure of a tropical keystone plant resource. Nature, 391, 685–687.

    Oddou-Muratorio S, Houot M-L, Demesure-Musch B, Austerlitz F (2003) Pollen flow in the wildservice tree, Sorbus torminalis (L.) Crantz. I. Evaluating the paternity analysis procedure in continuous populations. Molecular Ecology, 12, 3427–3439.

    Rieseberg LH, Burke JM (2001) The biological reality of species: gene flow, selection, and collective evolution. Taxon, 50, 47–67.

    Ritland K (1989) Correlated matings in the partial selfer Mimulus guttatus. Evolution, 43, 848–859.

    Ritland K (2002) Extensions of models for the estimation of mating systems using n independent loci. Heredity, 88, 221–228.

    Robledo-Arnuncio JJ, Smouse PE, Gil L, Alia R (2004) Pollen movement under alternative silvicultural practices in native populations of Scots pine (Pinus sylvestris L.) in central Spain. Forest Ecology and Management, 197, 245–255.

    Rogers CA, Levetin E (1998) Evidence of long-distance transport of mountain cedar pollen into Tulsa, Oklahoma. International Journal of Biometeorology, 42, 65–72.

    Schnabel A (1998) Parentage analysis in plants: mating systems, gene flow, and relative fertilities. In: Advances in Molecular Ecology (ed. Carvalho GR), pp. 173–189. IOS Press, Amsterdam, The Netherlands.

    Smith DB, Adams WT (1983) Measuring pollen contamination in clonal seed orchards with the aid of genetic markers. In: Proceedings of the 20th Southern Forest Tree Improvement Conference, pp. 69–77, University of Georgia, Athens, Georgia, USA.

    Smouse PE, Sork VL (2004) Measuring pollen flow in forest trees: an exposition of alternative approaches. Forest Ecology and Management, 197, 21–38.

    Smouse PE, Meagher T, Kobak C (1999) Parentage analysis in Chamaelirium luteum (L.) Gray (Liliaceae): why do some males have higher reproductive contributions? Journal of Evolutionary Biology, 12 (6), 1069–1077.

    Smouse PE, Dyer RJ, Westfall RD, Sork VL (2001) Two-generation analysis of pollen flow across a landscape. I. Male gamete heterogeneity among females. Evolution, 55, 260–271.

    Sork VL, Nason J, Campbell DR, Fernandez JF (1999) Landscape approaches to historical and contemporary gene flow in plants. Trends in Ecology & Evolution, 14 (6), 219–224.

    Sork VL, Davis F, Smouse PE et al. (2002a) Pollen movement in declining populations of California valley oak, Quercus lobata: where have all the fathers gone? Molecular Ecology, 11, 1657–1668.

    Sork VL, Dyer RJ, Davis FW, Smouse PE (2002b) Mating patterns in a savanna population of valley oak (Quercus lobata Née). In: Proceedings of the Fifth Symposium on Oak Woodlands: Oaks in California’s Changing Landscape, October 22–25, 2001 (eds Standiford R, McCreary D, Purcell KL), pp. 427–439. Pacific SW Research Station, US Forest Service, USDA, San Diego, California.

    Streiff R, Ducousso A, Lexer C, Steinkellner H, Gloessl J, Kremer A (1999) Pollen dispersal inferred from paternity analysis in a mixed oak stand of Quercus robur L. and Q. petraea (Matt.) Liebl. Molecular Ecology, 8, 831–841.

    Williams CG, Savolainen O (1996) Inbreeding depression in conifers: implications for breeding strategy. Forest Science, 42, 102–117.

    Jarek Burczyk is particularly interested in estimating and modelling mating patterns and gene flow in natural and breeding plant populations. Tomasz Koralewski is interested in creating and using computer simulations to address biological issues, population genetics in particular. He also wrote the nzg computer program. Anyone wishing to contact him should email: [email protected]
