Relatedness and differentiation in arbitrary population ......0.3 Individuals Kinship Ournewkinship...

10
Relatedness and differentiation in arbitrary population structures Alejandro Ochoa, StatGen Center, Duke University with John D. Storey, Princeton University DrAlexOchoa ochoalab.github.io [email protected]

Transcript of Relatedness and differentiation in arbitrary population ......0.3 Individuals Kinship Ournewkinship...

Page 1: Relatedness and differentiation in arbitrary population ......0.3 Individuals Kinship Ournewkinship estimates Genotypesfrom“HumanOrigins” (Lazaridiset al. 2014,2016; Skoglundet

●●

●●

●●

●●

●●

●●

●●

●●

●● ●●

●●

●●

●●

●●

●●

●●

●●

●●● ●

●●

●●

●●

●●

●● ●

●●

●●

●●

●●

● ●

● ● ●

●●

●●

●●

● ●

●●

●●●

●●

●●

●●

●● ●

●●

● ●

0.2

0.3

0.4

Inbr

eedi

ng c

oeff.

Relatedness and differentiationin arbitrary population structuresAlejandro Ochoa, StatGen Center, Duke University

with John D. Storey, Princeton University7 DrAlexOchoa � ochoalab.github.io R [email protected]

Page 2: Relatedness and differentiation in arbitrary population ......0.3 Individuals Kinship Ournewkinship estimates Genotypesfrom“HumanOrigins” (Lazaridiset al. 2014,2016; Skoglundet

New kinship estimator for arbitrary population structuresxij ∈ {0, 1, 2} : Genotype at locus i of individual j .

Model:

E[xij |T ] = 2pi , Cov(xij , xik |T ) = 4pi (1− pi)ϕjk .

Standard estimator is biased:

p̂i =12n

n∑j=1

xij , ϕ̂stdjk =

m∑i=1

(xij − 2p̂i) (xik − 2p̂i)

4m∑i=1

p̂i (1− p̂i)

a.s.−−−→m→∞

ϕjk − ϕ̄j − ϕ̄k + ϕ̄

1− ϕ̄.

popkin: first unbiased kinship estimator! — R package on CRAN

Ajk =1m

m∑i=1

(xij − 1)(xik − 1)− 1, Amin = minu 6=v

1|Su||Sv |

∑j∈Su

∑k∈Sv

Ajk ,

ϕ̂jk = 1− Ajk

Amin

a.s.−−−→m→∞

ϕjk .

2 / 7

Page 3: Relatedness and differentiation in arbitrary population ......0.3 Individuals Kinship Ournewkinship estimates Genotypesfrom“HumanOrigins” (Lazaridiset al. 2014,2016; Skoglundet

New kinship estimator for arbitrary population structuresxij ∈ {0, 1, 2} : Genotype at locus i of individual j . Model:

E[xij |T ] = 2pi , Cov(xij , xik |T ) = 4pi (1− pi)ϕjk .

Standard estimator is biased:

p̂i =12n

n∑j=1

xij , ϕ̂stdjk =

m∑i=1

(xij − 2p̂i) (xik − 2p̂i)

4m∑i=1

p̂i (1− p̂i)

a.s.−−−→m→∞

ϕjk − ϕ̄j − ϕ̄k + ϕ̄

1− ϕ̄.

popkin: first unbiased kinship estimator! — R package on CRAN

Ajk =1m

m∑i=1

(xij − 1)(xik − 1)− 1, Amin = minu 6=v

1|Su||Sv |

∑j∈Su

∑k∈Sv

Ajk ,

ϕ̂jk = 1− Ajk

Amin

a.s.−−−→m→∞

ϕjk .

2 / 7

Page 4: Relatedness and differentiation in arbitrary population ......0.3 Individuals Kinship Ournewkinship estimates Genotypesfrom“HumanOrigins” (Lazaridiset al. 2014,2016; Skoglundet

New kinship estimator for arbitrary population structuresxij ∈ {0, 1, 2} : Genotype at locus i of individual j . Model:

E[xij |T ] = 2pi , Cov(xij , xik |T ) = 4pi (1− pi)ϕjk .

Standard estimator is biased:

p̂i =12n

n∑j=1

xij , ϕ̂stdjk =

m∑i=1

(xij − 2p̂i) (xik − 2p̂i)

4m∑i=1

p̂i (1− p̂i)

a.s.−−−→m→∞

ϕjk − ϕ̄j − ϕ̄k + ϕ̄

1− ϕ̄.

popkin: first unbiased kinship estimator! — R package on CRAN

Ajk =1m

m∑i=1

(xij − 1)(xik − 1)− 1, Amin = minu 6=v

1|Su||Sv |

∑j∈Su

∑k∈Sv

Ajk ,

ϕ̂jk = 1− Ajk

Amin

a.s.−−−→m→∞

ϕjk .

2 / 7

Page 5: Relatedness and differentiation in arbitrary population ......0.3 Individuals Kinship Ournewkinship estimates Genotypesfrom“HumanOrigins” (Lazaridiset al. 2014,2016; Skoglundet

New kinship estimator for arbitrary population structuresxij ∈ {0, 1, 2} : Genotype at locus i of individual j . Model:

E[xij |T ] = 2pi , Cov(xij , xik |T ) = 4pi (1− pi)ϕjk .

Standard estimator is biased:

p̂i =12n

n∑j=1

xij , ϕ̂stdjk =

m∑i=1

(xij − 2p̂i) (xik − 2p̂i)

4m∑i=1

p̂i (1− p̂i)

a.s.−−−→m→∞

ϕjk − ϕ̄j − ϕ̄k + ϕ̄

1− ϕ̄.

popkin: first unbiased kinship estimator! — R package on CRAN

Ajk =1m

m∑i=1

(xij − 1)(xik − 1)− 1, Amin = minu 6=v

1|Su||Sv |

∑j∈Su

∑k∈Sv

Ajk ,

ϕ̂jk = 1− Ajk

Amin

a.s.−−−→m→∞

ϕjk .

2 / 7

Page 6: Relatedness and differentiation in arbitrary population ......0.3 Individuals Kinship Ournewkinship estimates Genotypesfrom“HumanOrigins” (Lazaridiset al. 2014,2016; Skoglundet

Ju_h

oan_

Sou

thJu

_hoa

n_N

orth

Taa_

Wes

tTa

a_E

ast

Taa_

Nor

thN

aro

Gui

Hoa

nX

uun

Gan

aT

shw

aK

hom

ani

Nam

aH

aiom

Kga

laga

diS

hua

Khw

eM

buti

Bia

kaB

antu

SA

Tsw

ana

Dam

ara

Him

baW

ambo

Yoru

baE

san

Men

deB

antu

Ken

yaLu

hya

Man

denk

aG

ambi

an Luo

Din

kaK

ikuy

uH

adza

San

daw

eM

asai

Dat

ogS

omal

iO

rom

oJe

w_E

thio

pian

Sah

araw

iM

oroc

can

Moz

abite

Alg

eria

nTu

nisi

anLi

byan

Egy

ptia

nYe

men

iJe

w_L

ibya

nJe

w_T

unis

ian

Jew

_Mor

occa

nJe

w_Y

emen

iteS

audi

Bed

ouin

BB

edou

inA

Pal

estin

ian

Jord

ania

nS

yria

nLe

bane

se_M

uslim

Leba

nese

_Chr

istia

nJe

w_T

urki

shA

ssyr

ian

Dru

zeLe

bane

seJe

w_i

raqi

Iran

ian_

Ban

dari

Iran

ian

Jew

_Ira

nian

Turk

ish

Jew

_Ash

kena

ziC

yprio

tM

alte

seC

anar

y_Is

land

erIta

lian_

Sou

thS

icili

anR

oman

ian

Fre

nchN

Spa

nish

SW

Gre

ekIta

lian_

Nor

thS

pani

shN

EF

renc

hSS

ardi

nian

Bas

que

Orc

adia

nE

nglis

hIc

elan

dic

Nor

weg

ian

Iris

hIr

ish_

Uls

ter

Sco

ttish

She

tland

icG

erm

anS

orb

Pol

ish

Cze

chC

roat

ian

Hun

garia

nA

lban

ian

Bul

garia

nM

ordo

vian

Fin

nish

Rus

sian

Est

onia

nU

krai

nian

Bel

arus

ian

Lith

uani

anA

rmen

ian

Jew

_Geo

rgia

nG

eorg

ian

Abk

hasi

anA

dyge

iN

orth

_Oss

etia

nC

hech

enLe

zgin

Kum

ykB

alka

rN

ogai

Mak

rani

Bra

hui

Bal

ochi

Jew

_Coc

hin

Tajik

Kal

ash

Pat

han

Sin

dhi

Bur

usho

Bra

hmin

_Tiw

ari

Guj

arat

iP

unja

biV

ishw

abra

hmin

Lodh

iM

ala

Ben

gali

Kha

riaO

nge

Chu

vash

Turk

men

Uzb

ekH

azar

aU

ygur

Tuba

lar

Man

siS

elku

pK

yrgy

zA

ltaia

nTu

vini

anN

gana

san

Dol

gan

Yaku

tX

ibo

Hez

hen

Oro

qen

Ulc

hiE

ven

Yuka

gir

Itelm

enK

orya

kC

hukc

hiE

skim

oA

leut

Ale

ut_T

lingi

tK

usun

daK

alm

ykB

urm

ese

Cam

bodi

anT

hai

TuM

ongo

laD

aur

Kin

hV

ietn

ames

eD

aiLa

huN

axi

Yi

Tujia

Han

Mia

oS

heJa

pane

seK

orea

nA

mi

Ata

yal

Kan

kana

eyM

urut

Dus

unIlo

cano

Taga

log

Vis

ayan

Baj

oM

alay

Lebb

oC

hipe

wya

nC

ree

Alg

onqu

inO

jibw

aP

ima

Mix

eM

ixte

cZ

apot

ecM

ayan

Inga

Kaq

chik

elC

abec

arP

iapo

coK

ariti

ana

Sur

uiB

oliv

ian

Aym

ara

Que

chua

Gua

rani

Chi

lote

Mus

sau

Sap

osa

Buk

aTe

opT

igak

Aus

tral

ian

Kov

eM

elam

ela

Nak

anai

_Bile

kiM

angs

eng

Man

usN

asoi

Nai

likN

otsi

SW

_Bou

gain

ville

Kuo

t_K

abil

Kuo

t_La

mal

aua

Lavo

ngai

Mad

akTo

lai

Men

gen

Mam

usi

Mam

usi_

Pal

eabu

Nak

anai

_Los

oA

taS

ulka

Kol

_New

_Brit

ain

Bai

ning

_Mal

asai

tB

aini

ng_M

arab

uP

apua

n

Ju_hoan_SouthJu_hoan_North

Taa_WestTaa_East

Taa_NorthNaro

GuiHoanXuunGana

TshwaKhomani

NamaHaiom

KgalagadiShuaKhweMbutiBiaka

BantuSATswanaDamara

HimbaWamboYoruba

EsanMende

BantuKenyaLuhya

MandenkaGambian

LuoDinka

KikuyuHadza

SandaweMasaiDatog

SomaliOromo

Jew_EthiopianSaharawi

MoroccanMozabiteAlgerianTunisian

LibyanEgyptian

YemeniJew_Libyan

Jew_TunisianJew_MoroccanJew_Yemenite

SaudiBedouinBBedouinA

PalestinianJordanian

SyrianLebanese_Muslim

Lebanese_ChristianJew_Turkish

AssyrianDruze

LebaneseJew_iraqi

Iranian_BandariIranian

Jew_IranianTurkish

Jew_AshkenaziCypriotMaltese

Canary_IslanderItalian_South

SicilianRomanian

FrenchNSpanishSW

GreekItalian_North

SpanishNEFrenchS

SardinianBasque

OrcadianEnglish

IcelandicNorwegian

IrishIrish_Ulster

ScottishShetlandic

GermanSorb

PolishCzech

CroatianHungarian

AlbanianBulgarian

MordovianFinnish

RussianEstonian

UkrainianBelarusianLithuanianArmenian

Jew_GeorgianGeorgian

AbkhasianAdygei

North_OssetianChechen

LezginKumykBalkarNogai

MakraniBrahui

BalochiJew_Cochin

TajikKalashPathanSindhi

BurushoBrahmin_Tiwari

GujaratiPunjabi

VishwabrahminLodhiMala

BengaliKhariaOnge

ChuvashTurkmen

UzbekHazaraUygur

TubalarMansi

SelkupKyrgyzAltaian

TuvinianNganasan

DolganYakutXibo

HezhenOroqen

UlchiEven

YukagirItelmenKoryak

ChukchiEskimo

AleutAleut_Tlingit

KusundaKalmyk

BurmeseCambodian

ThaiTu

MongolaDaurKinh

VietnameseDai

LahuNaxi

YiTujiaHan

MiaoShe

JapaneseKorean

AmiAtayal

KankanaeyMurut

DusunIlocanoTagalogVisayan

BajoMalayLebbo

ChipewyanCree

AlgonquinOjibwa

PimaMixe

MixtecZapotec

MayanInga

KaqchikelCabecarPiapoco

KaritianaSurui

BolivianAymara

QuechuaGuaraniChilote

MussauSaposa

BukaTeop

TigakAustralian

KoveMelamela

Nakanai_BilekiMangseng

ManusNasoiNailikNotsi

SW_BougainvilleKuot_Kabil

Kuot_LamalauaLavongai

MadakTolai

MengenMamusi

Mamusi_PaleabuNakanai_Loso

AtaSulka

Kol_New_BritainBaining_MalasaitBaining_Marabu

Papuan

SAfrica MAfrica NAfrica MiddleEast Europe Caucasus SAsia NAsia EAsia Americas Oceania

SA

fric

aM

Afr

ica

NA

fric

aM

iddl

eEas

tE

urop

eC

auca

sus

SA

sia

NA

sia

EA

sia

Am

eric

asO

cean

ia

00.

10.

20.

3

Kin

ship

Indi

vidu

als

Our new kinshipestimatesGenotypes from “Human Origins”(Lazaridis et al. 2014, 2016;Skoglund et al. 2016)

Edited from Ephert [CC BY-SA 3.0], viaWikimedia Commons

*Inbreeding coeffs. on diagonal

3 / 7

Page 7: Relatedness and differentiation in arbitrary population ......0.3 Individuals Kinship Ournewkinship estimates Genotypesfrom“HumanOrigins” (Lazaridiset al. 2014,2016; Skoglundet

Ju_h

oan_

Sou

thJu

_hoa

n_N

orth

Taa_

Wes

tTa

a_E

ast

Taa_

Nor

thN

aro

Gui

Hoa

nX

uun

Gan

aT

shw

aK

hom

ani

Nam

aH

aiom

Kga

laga

diS

hua

Khw

eM

buti

Bia

kaB

antu

SA

Tsw

ana

Dam

ara

Him

baW

ambo

Yoru

baE

san

Men

deB

antu

Ken

yaLu

hya

Man

denk

aG

ambi

an Luo

Din

kaK

ikuy

uH

adza

San

daw

eM

asai

Dat

ogS

omal

iO

rom

oJe

w_E

thio

pian

Sah

araw

iM

oroc

can

Moz

abite

Alg

eria

nTu

nisi

anLi

byan

Egy

ptia

nYe

men

iJe

w_L

ibya

nJe

w_T

unis

ian

Jew

_Mor

occa

nJe

w_Y

emen

iteS

audi

Bed

ouin

BB

edou

inA

Pal

estin

ian

Jord

ania

nS

yria

nLe

bane

se_M

uslim

Leba

nese

_Chr

istia

nJe

w_T

urki

shA

ssyr

ian

Dru

zeLe

bane

seJe

w_i

raqi

Iran

ian_

Ban

dari

Iran

ian

Jew

_Ira

nian

Turk

ish

Jew

_Ash

kena

ziC

yprio

tM

alte

seC

anar

y_Is

land

erIta

lian_

Sou

thS

icili

anR

oman

ian

Fre

nchN

Spa

nish

SW

Gre

ekIta

lian_

Nor

thS

pani

shN

EF

renc

hSS

ardi

nian

Bas

que

Orc

adia

nE

nglis

hIc

elan

dic

Nor

weg

ian

Iris

hIr

ish_

Uls

ter

Sco

ttish

She

tland

icG

erm

anS

orb

Pol

ish

Cze

chC

roat

ian

Hun

garia

nA

lban

ian

Bul

garia

nM

ordo

vian

Fin

nish

Rus

sian

Est

onia

nU

krai

nian

Bel

arus

ian

Lith

uani

anA

rmen

ian

Jew

_Geo

rgia

nG

eorg

ian

Abk

hasi

anA

dyge

iN

orth

_Oss

etia

nC

hech

enLe

zgin

Kum

ykB

alka

rN

ogai

Mak

rani

Bra

hui

Bal

ochi

Jew

_Coc

hin

Tajik

Kal

ash

Pat

han

Sin

dhi

Bur

usho

Bra

hmin

_Tiw

ari

Guj

arat

iP

unja

biV

ishw

abra

hmin

Lodh

iM

ala

Ben

gali

Kha

riaO

nge

Chu

vash

Turk

men

Uzb

ekH

azar

aU

ygur

Tuba

lar

Man

siS

elku

pK

yrgy

zA

ltaia

nTu

vini

anN

gana

san

Dol

gan

Yaku

tX

ibo

Hez

hen

Oro

qen

Ulc

hiE

ven

Yuka

gir

Itelm

enK

orya

kC

hukc

hiE

skim

oA

leut

Ale

ut_T

lingi

tK

usun

daK

alm

ykB

urm

ese

Cam

bodi

anT

hai

TuM

ongo

laD

aur

Kin

hV

ietn

ames

eD

aiLa

huN

axi

Yi

Tujia

Han

Mia

oS

heJa

pane

seK

orea

nA

mi

Ata

yal

Kan

kana

eyM

urut

Dus

unIlo

cano

Taga

log

Vis

ayan

Baj

oM

alay

Lebb

oC

hipe

wya

nC

ree

Alg

onqu

inO

jibw

aP

ima

Mix

eM

ixte

cZ

apot

ecM

ayan

Inga

Kaq

chik

elC

abec

arP

iapo

coK

ariti

ana

Sur

uiB

oliv

ian

Aym

ara

Que

chua

Gua

rani

Chi

lote

Mus

sau

Sap

osa

Buk

aTe

opT

igak

Aus

tral

ian

Kov

eM

elam

ela

Nak

anai

_Bile

kiM

angs

eng

Man

usN

asoi

Nai

likN

otsi

SW

_Bou

gain

ville

Kuo

t_K

abil

Kuo

t_La

mal

aua

Lavo

ngai

Mad

akTo

lai

Men

gen

Mam

usi

Mam

usi_

Pal

eabu

Nak

anai

_Los

oA

taS

ulka

Kol

_New

_Brit

ain

Bai

ning

_Mal

asai

tB

aini

ng_M

arab

uP

apua

n

Ju_hoan_SouthJu_hoan_North

Taa_WestTaa_East

Taa_NorthNaro

GuiHoanXuunGana

TshwaKhomani

NamaHaiom

KgalagadiShuaKhweMbutiBiaka

BantuSATswanaDamara

HimbaWamboYoruba

EsanMende

BantuKenyaLuhya

MandenkaGambian

LuoDinka

KikuyuHadza

SandaweMasaiDatog

SomaliOromo

Jew_EthiopianSaharawi

MoroccanMozabiteAlgerianTunisian

LibyanEgyptian

YemeniJew_Libyan

Jew_TunisianJew_MoroccanJew_Yemenite

SaudiBedouinBBedouinA

PalestinianJordanian

SyrianLebanese_Muslim

Lebanese_ChristianJew_Turkish

AssyrianDruze

LebaneseJew_iraqi

Iranian_BandariIranian

Jew_IranianTurkish

Jew_AshkenaziCypriotMaltese

Canary_IslanderItalian_South

SicilianRomanian

FrenchNSpanishSW

GreekItalian_North

SpanishNEFrenchS

SardinianBasque

OrcadianEnglish

IcelandicNorwegian

IrishIrish_Ulster

ScottishShetlandic

GermanSorb

PolishCzech

CroatianHungarian

AlbanianBulgarian

MordovianFinnish

RussianEstonian

UkrainianBelarusianLithuanianArmenian

Jew_GeorgianGeorgian

AbkhasianAdygei

North_OssetianChechen

LezginKumykBalkarNogai

MakraniBrahui

BalochiJew_Cochin

TajikKalashPathanSindhi

BurushoBrahmin_Tiwari

GujaratiPunjabi

VishwabrahminLodhiMala

BengaliKhariaOnge

ChuvashTurkmen

UzbekHazaraUygur

TubalarMansi

SelkupKyrgyzAltaian

TuvinianNganasan

DolganYakutXibo

HezhenOroqen

UlchiEven

YukagirItelmenKoryak

ChukchiEskimo

AleutAleut_Tlingit

KusundaKalmyk

BurmeseCambodian

ThaiTu

MongolaDaurKinh

VietnameseDai

LahuNaxi

YiTujiaHan

MiaoShe

JapaneseKorean

AmiAtayal

KankanaeyMurut

DusunIlocanoTagalogVisayan

BajoMalayLebbo

ChipewyanCree

AlgonquinOjibwa

PimaMixe

MixtecZapotec

MayanInga

KaqchikelCabecarPiapoco

KaritianaSurui

BolivianAymara

QuechuaGuaraniChilote

MussauSaposa

BukaTeop

TigakAustralian

KoveMelamela

Nakanai_BilekiMangseng

ManusNasoiNailikNotsi

SW_BougainvilleKuot_Kabil

Kuot_LamalauaLavongai

MadakTolai

MengenMamusi

Mamusi_PaleabuNakanai_Loso

AtaSulka

Kol_New_BritainBaining_MalasaitBaining_Marabu

Papuan

SAfrica MAfrica NAfrica MiddleEast Europe Caucasus SAsia NAsia EAsia Americas Oceania

SA

fric

aM

Afr

ica

NA

fric

aM

iddl

eEas

tE

urop

eC

auca

sus

SA

sia

NA

sia

EA

sia

Am

eric

asO

cean

ia

−0.0

50

0.05

0.1

0.15

Kin

ship

Indi

vidu

als

Standard kinshipestimatesGenotypes from “Human Origins”(Lazaridis et al. 2014, 2016;Skoglund et al. 2016)

Edited from Ephert [CC BY-SA 3.0], viaWikimedia Commons

4 / 7

Page 8: Relatedness and differentiation in arbitrary population ......0.3 Individuals Kinship Ournewkinship estimates Genotypesfrom“HumanOrigins” (Lazaridiset al. 2014,2016; Skoglundet

Differentiation (FST) previously underestimated

0.0 0.1 0.2 0.3 0.4 0.5

020

40

Inbreeding Coefficient

Den

sity

SAfricaMAfricaNAfricaMiddleEastEuropeCaucasusSAsiaNAsiaEAsiaAmericasOceania

Subpopulations

SAfricaMAfricaNAfricaMiddleEastEuropeCaucasusSAsiaNAsiaEAsiaAmericasOceania

FST estimates

BayeScanWeir−CockerhamHudsonKNew

5 / 7

Page 9: Relatedness and differentiation in arbitrary population ......0.3 Individuals Kinship Ournewkinship estimates Genotypesfrom“HumanOrigins” (Lazaridiset al. 2014,2016; Skoglundet

00.

10.

20.

30.

4

Kin

ship

Indi

vidu

als

0.0

0.5

1.0

Individuals

Anc

estr

y fr

ac.

x

PU

RC

LMP

EL

MX

L

Pop

ulat

ion

x

AF

RA

MR

EU

R

Anc

estr

y

Kinship driven byadmixture in Hispanics

Our new kinship estimates

Genotypes from the 1000 Genomes Project (2013)

6 / 7

Page 10: Relatedness and differentiation in arbitrary population ......0.3 Individuals Kinship Ournewkinship estimates Genotypesfrom“HumanOrigins” (Lazaridiset al. 2014,2016; Skoglundet

Better kinship estimates will improve GWAS!

¡Muchas gracias!

Princeton UniversityJohn D. StoreyWei Hao

University of WarsawNeo Christopher Chung

FundingNational Institutes of HealthOtsuka Pharmaceutical

Duke StatGen CenterBiostatistics & Bioinformatics

Lewis-Sigler Institute for Integrative Genomics

7 / 7