The Multigraph for Loglinear Models

Harry Khamis, Statistical Consulting Center, Wright State University, Dayton, Ohio, USA


Page 1: The Multigraph for  Loglinear Models

The Multigraph for Loglinear Models

Harry Khamis
Statistical Consulting Center
Wright State University
Dayton, Ohio, USA

Page 2: The Multigraph for  Loglinear Models

OUTLINE

1. LOGLINEAR MODEL (LLM)
- two-way table
- three-way table
- examples

2. MULTIGRAPH
- construction
- maximum spanning tree
- conditional independencies
- collapsibility

3. EXAMPLES

Page 3: The Multigraph for  Loglinear Models

Loglinear Model

Goal

Identify the structure of associations among a set of categorical variables.

Page 4: The Multigraph for  Loglinear Models

LLM: two variables

                           Y
            1     2     3     …     J     Total
  ------------------------------------------------
       1    n11   n12   n13   …     n1J   n1+
       2    n21   n22   n23   …     n2J   n2+
  X    .    .     .     .           .     .
       .    .     .     .           .     .
       .    .     .     .           .     .
       I    nI1   nI2   nI3   …     nIJ   nI+
  Total     n+1   n+2   n+3   …     n+J   n

Page 5: The Multigraph for  Loglinear Models

LLM: two variables
Example

Survey of High School Seniors in Dayton, Ohio
Collaboration: WSU Boonshoft School of Medicine and United Health Services of Dayton

                            Marijuana Use?
                            Yes      No     Total
  --------------------------------------------------
  Cigarette Use?   Yes      914     581     1495
                   No        46     735      781
                   Total    960    1316     2276

Page 6: The Multigraph for  Loglinear Models

LLM: two variables


Two discrete variables, X and Y

Model of independence: generating class is [X][Y]

Page 7: The Multigraph for  Loglinear Models

LLM: two variables

LLM of independence:

$$\log \mu_{ij} = \lambda + \lambda_i^X + \lambda_j^Y$$

where

$$\sum_i \lambda_i^X = \sum_j \lambda_j^Y = 0$$

Page 8: The Multigraph for  Loglinear Models

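LLM: two variables

Saturated LLM: generating class is [XY]:

$$\log \mu_{ij} = \lambda + \lambda_i^X + \lambda_j^Y + \lambda_{ij}^{XY}$$

where

$$\sum_i \lambda_i^X = \sum_j \lambda_j^Y = \sum_i \lambda_{ij}^{XY} = \sum_j \lambda_{ij}^{XY} = 0$$

Note: $\lambda_{ij}^{XY}$ is related to the odds ratio.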
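The link between the interaction term and the odds ratio can be made explicit for a 2×2 table. The derivation below is a sketch: it assumes that $\mu_{ij}$ denotes the expected cell frequency and that the parameters satisfy sum-to-zero constraints. The numerical value uses the cigarette-by-marijuana counts from the Dayton example.

```latex
\begin{align*}
\log\theta
  &= \log\frac{\mu_{11}\,\mu_{22}}{\mu_{12}\,\mu_{21}}
   = \lambda_{11}^{XY} + \lambda_{22}^{XY} - \lambda_{12}^{XY} - \lambda_{21}^{XY}
   = 4\,\lambda_{11}^{XY}, \\
\hat\theta
  &= \frac{n_{11}\,n_{22}}{n_{12}\,n_{21}}
   = \frac{914 \times 735}{581 \times 46} \approx 25.1 .
\end{align*}
```

The main-effect terms cancel in the log odds ratio, so independence ([X][Y]) corresponds to $\theta = 1$.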

Page 9: The Multigraph for  Loglinear Models

LLM: two variables

  Interpretation         Generating Class   Probabilistic Model
  ---------------------------------------------------------------
  X and Y independent    [X][Y]             p_ij = p_i+ p_+j
  X and Y dependent      [XY]               p_ij
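As an illustration of the independence model [X][Y], the following minimal sketch fits it to the cigarette-by-marijuana table from the Dayton survey. The function names are hypothetical, and the likelihood-ratio statistic G² is used here as one common measure of fit; the slides do not say which statistic was used for this table.

```python
import math

# Observed counts from the Dayton survey:
# rows = cigarette use (yes, no), columns = marijuana use (yes, no).
observed = [[914, 581],
            [46, 735]]

def fit_independence(table):
    """Expected counts under [X][Y]: m_ij = n_i+ * n_+j / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def g_squared(table, expected):
    """Likelihood-ratio statistic G^2 = 2 * sum of n_ij * log(n_ij / m_ij)."""
    return 2 * sum(o * math.log(o / e)
                   for o_row, e_row in zip(table, expected)
                   for o, e in zip(o_row, e_row) if o > 0)

expected = fit_independence(observed)
g2 = g_squared(observed, expected)
odds_ratio = (observed[0][0] * observed[1][1]) / (observed[0][1] * observed[1][0])

print("expected counts:", [[round(e, 1) for e in row] for row in expected])
print("G^2 on 1 df:", round(g2, 1))
print("sample odds ratio:", round(odds_ratio, 1))
```

A G² value far above the 1-df critical value of 3.84 indicates that [X][Y] fits poorly, so the saturated model [XY] is needed for this table.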

Page 10: The Multigraph for  Loglinear Models

LLM: three variables
Example: Dayton High School Data

  Alcohol   Cigarette     Marijuana Use
  Use       Use           Yes      No
  ----------------------------------------
  Yes       Yes           911     538
            No             44     456
  No        Yes             3      43
            No              2     279

Page 11: The Multigraph for  Loglinear Models

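LLM: three variables

Saturated LLM, [XYZ]:

$$\log \mu_{ijk} = \lambda + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \lambda_{ij}^{XY} + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ} + \lambda_{ijk}^{XYZ}$$

where

$$\sum_i \lambda_i^X = \sum_j \lambda_j^Y = \sum_i \lambda_{ij}^{XY} = \sum_j \lambda_{ij}^{XY} = \cdots = \sum_k \lambda_{ijk}^{XYZ} = 0$$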

Page 12: The Multigraph for  Loglinear Models

LLM: three variables

  Interpretation             Generating Class   Probabilistic Model
  --------------------------------------------------------------------------
  mutual independence        [X][Y][Z]          p_ijk = p_i++ p_+j+ p_++k
  joint independence         [XZ][Y]            p_ijk = p_i+k p_+j+
  conditional independence   [XY][XZ]           p_ijk = p_ij+ p_i+k / p_i++
  homogeneous association*   [XY][XZ][YZ]       *
  saturated model            [XYZ]              p_ijk

  *nondecomposable model

Page 13: The Multigraph for  Loglinear Models

Decomposable LLMs

- closed-form expression for MLEs
- closed-form expression for asymptotic variances (Lee, 1977)
- conditional G² statistic simplifies
- allow for causal interpretations
- easier to interpret the LLM
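To illustrate what a closed-form MLE looks like, the sketch below fits the decomposable conditional independence model [XY][XZ] to the Dayton three-way table, taking X = alcohol, Y = cigarette and Z = marijuana. The choice of this model is for illustration only; the slides do not claim it is the best-fitting one. The function name is hypothetical, and the formula is the fitted-count analogue of p_ijk = p_ij+ p_i+k / p_i++.

```python
# Dayton counts n[i][j][k]: i = alcohol, j = cigarette, k = marijuana
# (index 0 = yes, index 1 = no).
n = [[[911, 538], [44, 456]],
     [[3, 43], [2, 279]]]

def fit_conditional_independence(n):
    """Closed-form fitted counts for [XY][XZ]: m_ijk = n_ij+ * n_i+k / n_i++."""
    I, J, K = len(n), len(n[0]), len(n[0][0])
    fitted = [[[0.0] * K for _ in range(J)] for _ in range(I)]
    for i in range(I):
        n_i = sum(n[i][j][k] for j in range(J) for k in range(K))  # n_i++
        for j in range(J):
            n_ij = sum(n[i][j])                                    # n_ij+
            for k in range(K):
                n_ik = sum(n[i][jj][k] for jj in range(J))         # n_i+k
                fitted[i][j][k] = n_ij * n_ik / n_i
    return fitted

fitted = fit_conditional_independence(n)
for i, label in enumerate(("alcohol yes", "alcohol no")):
    print(label, [[round(m, 1) for m in row] for row in fitted[i]])
```

No iterative fitting is needed: the fitted counts are products and ratios of observed marginal totals, and they reproduce the XY and XZ margins exactly.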

Page 14: The Multigraph for  Loglinear Models


Page 15: The Multigraph for  Loglinear Models

3 Categorical Variables: X, Y, and Z

If [X⊗Y] and [Y⊗Z], then [X⊗Z]

FALSE!

Page 16: The Multigraph for  Loglinear Models

LLM: three variables

  Interpretation             Generating Class   Probabilistic Model
  --------------------------------------------------------------------------
  mutual independence        [X][Y][Z]          p_ijk = p_i++ p_+j+ p_++k
  joint independence         [XZ][Y]            p_ijk = p_i+k p_+j+
  conditional independence   [XY][XZ]           p_ijk = p_ij+ p_i+k / p_i++
  homogeneous association    [XY][XZ][YZ]       p_ijk = ψ_ij φ_ik ω_jk
  saturated model            [XYZ]              p_ijk

Page 17: The Multigraph for  Loglinear Models

3 Categorical Variables: X, Y, and Z

If [Y⊗Z] for all X = 1, 2, …, then [Y⊗Z]

FALSE!

Page 18: The Multigraph for  Loglinear Models

LLM: three variables

  Interpretation             Generating Class   Probabilistic Model
  --------------------------------------------------------------------------
  mutual independence        [X][Y][Z]          p_ijk = p_i++ p_+j+ p_++k
  joint independence         [XZ][Y]            p_ijk = p_i+k p_+j+
  conditional independence   [XY][XZ]           p_ijk = p_ij+ p_i+k / p_i++
  homogeneous association    [XY][XZ][YZ]       p_ijk = ψ_ij φ_ik ω_jk
  saturated model            [XYZ]              p_ijk

Page 19: The Multigraph for  Loglinear Models

3 Categorical Variables: X, Y, and Z

If [Y⊗Z], then [Y⊗Z] for all X = 1, 2, 3, …

FALSE!

Page 20: The Multigraph for  Loglinear Models

Which Treatment is Better?

                        TRIAL 1                        TRIAL 2
                        CURED?                         CURED?
                  Yes        No    Total         Yes         No    Total
  -------------------------------------------------------------------------
  TREATMENT  A    40 (.20)   160   200           85 (.85)     15   100
             B    30 (.15)   170   200           300 (.75)   100   400

Combine TRIALS 1 and 2:

                        CURED?
                  Yes         No    Total
  ------------------------------------------
  TREATMENT  A    125 (.42)   175   300
             B    330 (.55)   270   600

“Ask Marilyn”, PARADE section, DDN, pages 6-7, April 28, 1996
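The reversal in the treatment tables can be checked directly. The sketch below recomputes the cure rates within each trial and after pooling, using only the counts shown on the slide; the variable and function names are illustrative.

```python
# (cured, not cured) counts for each treatment, by trial, from the slide.
trials = {
    "Trial 1": {"A": (40, 160), "B": (30, 170)},
    "Trial 2": {"A": (85, 15), "B": (300, 100)},
}

def cure_rate(cured, not_cured):
    """Proportion cured among all patients on a treatment."""
    return cured / (cured + not_cured)

# Pool the two trials by adding counts treatment by treatment.
combined = {}
for arms in trials.values():
    for treatment, (cured, not_cured) in arms.items():
        c, nc = combined.get(treatment, (0, 0))
        combined[treatment] = (c + cured, nc + not_cured)

for name, arms in trials.items():
    print(name, {t: round(cure_rate(*counts), 2) for t, counts in arms.items()})
print("Combined", {t: round(cure_rate(*counts), 2) for t, counts in combined.items()})
```

Treatment A has the higher cure rate in each trial, yet B looks better after pooling, because most of B's patients were in the trial where cure rates were high for both treatments.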

Page 21: The Multigraph for  Loglinear Models

Florida Homicide Convictions Resulting in Death Penalty
ML Radelet and GL Pierce, Florida Law Review 43: 1-34, 1991

                              Death Penalty
                              Yes          No
  ----------------------------------------------
  Defendant’s Race   White    53 (0.11)    430
                     Black    15 (0.08)    176

White Victim

                              Death Penalty
                              Yes          No
  ----------------------------------------------
  Defendant’s Race   White    53 (0.11)    414
                     Black    11 (0.23)     37

Black Victim

                              Death Penalty
                              Yes          No
  ----------------------------------------------
  Defendant’s Race   White     0 (0.00)     16
                     Black     4 (0.03)    139

Page 22: The Multigraph for  Loglinear Models

Multigraph Representation of LLMs

Vertices = generators of the LLM

Multiedges = edges that are equal in number to the number of indices shared by the two vertices being joined
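The construction rule can be sketched in a few lines. The code below assumes each factor is a single letter, so a generator such as [ACR] is written as the string "ACR"; the function name `build_multigraph` is hypothetical.

```python
from itertools import combinations

def build_multigraph(generators):
    """Map each pair of generators to the set of indices (factors) they share.

    The number of shared indices is the multiplicity of the multiedge.
    Pairs with no shared index are not joined and are omitted.
    """
    edges = {}
    for g1, g2 in combinations(generators, 2):
        shared = set(g1) & set(g2)
        if shared:
            edges[(g1, g2)] = shared
    return edges

generators = ["AS", "ACR", "MCS", "MAC"]  # generating class [AS][ACR][MCS][MAC]
multigraph = build_multigraph(generators)
for (g1, g2), shared in multigraph.items():
    print(f"{g1} -- {g2}: {len(shared)} edge(s), shared indices {sorted(shared)}")
```

Storing the shared indices themselves, rather than only their count, keeps the information needed later to read off the set S on each branch.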

Page 23: The Multigraph for  Loglinear Models

Multigraph: three variables

[XY][XZ]

XY ------ XZ

Page 24: The Multigraph for  Loglinear Models

Examples of Multigraphs

[AS][ACR][MCS][MAC]

Vertices: AS, ACR, MAC, MCS

Page 25: The Multigraph for  Loglinear Models

Examples of Multigraphs

[ABCD][ACE][BCG][CDF]

Vertices: ABCD, CDF, ACE, BCG

Page 26: The Multigraph for  Loglinear Models

Maximum Spanning Tree

The maximum spanning tree of a multigraph M:
• tree (connected graph with no circuits)
• includes each vertex
• sum of the edges is maximum
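One way to find such a tree is Kruskal's algorithm with the edges taken heaviest first. This is a standard graph technique chosen here for illustration; the slides do not prescribe an algorithm. The sketch assumes single-letter factors and uses a hypothetical function name.

```python
from itertools import combinations

def maximum_spanning_tree(generators):
    """Kruskal's algorithm on the multigraph, taking the heaviest edges first.

    Returns a list of (generator1, generator2, shared_indices) branches.
    If the multigraph is disconnected, the result is a spanning forest.
    """
    edges = []
    for g1, g2 in combinations(generators, 2):
        shared = set(g1) & set(g2)
        if shared:
            edges.append((g1, g2, shared))
    edges.sort(key=lambda e: len(e[2]), reverse=True)

    parent = {g: g for g in generators}  # union-find forest

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path compression
            g = parent[g]
        return g

    tree = []
    for g1, g2, shared in edges:
        r1, r2 = find(g1), find(g2)
        if r1 != r2:  # adding this edge creates no circuit
            parent[r1] = r2
            tree.append((g1, g2, shared))
    return tree

tree = maximum_spanning_tree(["ABCD", "ACE", "BCG", "CDF"])
weight = sum(len(shared) for _, _, shared in tree)
for g1, g2, shared in tree:
    print(f"{g1} -- {g2}: S = {sorted(shared)}")
print("sum of the edges:", weight)
```

When several edges tie in weight, more than one maximum spanning tree may exist; this sketch returns whichever one the sort order produces.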

Page 27: The Multigraph for  Loglinear Models

Examples of maximum spanning trees

[XY][XZ]

XY ------ XZ

Page 28: The Multigraph for  Loglinear Models

Examples of maximum spanning trees

[AS][ACR][MCS][MAC]

Vertices: AS, ACR, MAC, MCS

Page 29: The Multigraph for  Loglinear Models

Examples of maximum spanning trees

[ABCD][ACE][BCG][CDF]

Vertices: ABCD, CDF, ACE, BCG

Page 30: The Multigraph for  Loglinear Models

Fundamental Conditional Independencies (FCIs) for a Decomposable LLM

1. Let S be the set of indices in a branch of the maximum spanning tree.

2. Remove each factor of S from the multigraph, M; the resulting multigraph is M/S.

3. An FCI is determined as:

   [C1 ⊗ C2 ⊗ … ⊗ Ck | S]

   where C1, C2, …, Ck are the sets of factors in the components of M/S.
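The three steps can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes single-letter factors, it does not check that the model is decomposable, and all function and class names are hypothetical. For a disconnected multigraph it also reports the components as independent with an empty S, which is an assumption chosen to match the [E][TB] example in the slides.

```python
from itertools import combinations

def shared_edges(vertices):
    """Multigraph edges as (a, b, shared indices) for vertex positions a < b."""
    return [(a, b, vertices[a] & vertices[b])
            for a, b in combinations(range(len(vertices)), 2)
            if vertices[a] & vertices[b]]

class UnionFind:
    def __init__(self, size):
        self.parent = list(range(size))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Merge the sets containing a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.parent[ra] = rb
        return True

def components(vertices):
    """Factor sets of the connected components (emptied vertices are dropped)."""
    uf = UnionFind(len(vertices))
    for a, b, _ in shared_edges(vertices):
        uf.union(a, b)
    groups = {}
    for i, v in enumerate(vertices):
        groups.setdefault(uf.find(i), set()).update(v)
    return [g for g in groups.values() if g]

def fundamental_conditional_independencies(generators):
    vertices = [set(g) for g in generators]
    fcis = []
    top_level = components(vertices)
    if len(top_level) > 1:  # disconnected multigraph: empty S (assumption)
        fcis.append((top_level, set()))
    uf = UnionFind(len(vertices))
    edges = sorted(shared_edges(vertices), key=lambda e: len(e[2]), reverse=True)
    for a, b, s in edges:
        if uf.union(a, b):  # step 1: the edge is a branch of the tree, S = s
            reduced = [v - s for v in vertices]      # step 2: form M/S
            fcis.append((components(reduced), s))    # step 3: read off the FCI
    return fcis

def show(fci):
    comps, s = fci
    left = " ⊗ ".join(",".join(sorted(c)) for c in comps)
    return f"[{left} | {','.join(sorted(s))}]" if s else f"[{left}]"

for fci in fundamental_conditional_independencies(["ACE", "MAC", "MCG"]):
    print(show(fci))
```

Each branch of the maximum spanning tree contributes one FCI, so a generating class with three connected generators, as in the usage line above, yields two.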

Page 31: The Multigraph for  Loglinear Models

FCIs

[XY][XZ]

Multigraph, M:  XY ---X--- XZ

S = {X}

M/S:  Y    Z

FCI: [Y⊗Z|X]

Page 32: The Multigraph for  Loglinear Models

Collapsibility Conditions

Consider a conditional independence relationship of the form

[C1 ⊗ C2 | S].

If the levels of all factors in C1 are collapsed, then all relationships among the remaining factors are undistorted EXCEPT for relationships among factors in S.

Page 33: The Multigraph for  Loglinear Models

FCIs

[XY][XZ]

Multigraph, M:  XY ---X--- XZ

S = {X}

M/S:  Y    Z

FCI: [Y⊗Z|X]

Page 34: The Multigraph for  Loglinear Models

Example: Ob-Gyn Study
(Darrocca, et al., 1996)

n = 201 pregnant mothers

Variables:
E: EGA (Early, Late)
B: Bishop score (High, Low)
T: Treatment (Prostin, Placebo)

Page 35: The Multigraph for  Loglinear Models

Example: Ob-Gyn Study

                              BISHOP SCORE (B)
                       High                   Low
                       EGA (E)                EGA (E)
  TREATMENT (T)        Early     Late         Early     Late
  --------------------------------------------------------------
  Prostin              34        24           27        21
  Placebo              22        16           35        22

Best-fitting model: [E][TB]

Page 36: The Multigraph for  Loglinear Models

Example: Ob-Gyn Study

Generating Class: [E][TB]

Multigraph:

E      TB

FCI: [E⊗T,B]

Page 37: The Multigraph for  Loglinear Models

Example: Ob-Gyn Study
Collapsed Table (collapse over EGA):

                           BISHOP SCORE (B)
                           High         Low    Total
  -----------------------------------------------------
  TREATMENT (T)  Prostin   58 (0.55)    48     106
                 Placebo   38 (0.40)    57      95

P = 0.037
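The collapsed table and its P-value can be reproduced from the three-way counts. The sketch below sums over EGA and applies an uncorrected Pearson chi-square test; the slide does not state which test produced P = 0.037, so the choice of test is an assumption. Only the standard library is used, with the 1-df tail probability taken from the identity P(χ² > x) = erfc(√(x/2)).

```python
import math

# counts[t][b][e]: t = treatment (0 Prostin, 1 Placebo),
# b = Bishop score (0 High, 1 Low), e = EGA (0 Early, 1 Late).
counts = [[[34, 24], [27, 21]],
          [[22, 16], [35, 22]]]

# Collapse over EGA by summing out the last index.
collapsed = [[sum(cell) for cell in row] for row in counts]

def pearson_chi_square(table):
    """Uncorrected Pearson X^2 for a two-way table of counts."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            x2 += (observed - expected) ** 2 / expected
    return x2

x2 = pearson_chi_square(collapsed)
p_value = math.erfc(math.sqrt(x2 / 2))  # valid for 1 degree of freedom only

print("collapsed table:", collapsed)
print("X^2 =", round(x2, 2), " P =", round(p_value, 3))
```

Collapsing over EGA is justified here by the FCI [E⊗T,B]: with S empty, the treatment-by-Bishop relationship is not distorted when E is summed out.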

Page 38: The Multigraph for  Loglinear Models

Example: WSU-United Way Study

M: Marijuana (No, Yes)
A: Alcohol (No, Yes)
C: Cigarettes (No, Yes)
R: Race (Other, White)
S: Sex (Female, Male)

Observed cell frequencies (n = 2,276):

   12     0    19     2     1     0    23    23
  117     1   218    13    17     1   268   405
   17     0    18     1     8     1    19    30
  133     1   201    28    17     1   228   453

Page 39: The Multigraph for  Loglinear Models

Example: WSU-United Way Study

Generating class: [ACE][MAC][MCG]

Multigraph, M:

Vertices: ACE, MCG, MAC

Page 40: The Multigraph for  Loglinear Models

Example: WSU-United Way Study

M:    vertices ACE, MCG, MAC

S = {A,C}

M/S:  E, MG, M

FCI:  [E ⊗ M,G | A,C]

A = Alcohol, C = Cigarette, E = Ethnic, G = Gender, M = Marijuana

Page 41: The Multigraph for  Loglinear Models

Example: WSU PASS Program

“Preparing for Academic Success”

GPA below 2.0 at the end of first quarter

Page 42: The Multigraph for  Loglinear Models

Example: WSU PASS Program

Variables (n = 972):

  FACTOR               LABEL   LEVELS
  -----------------------------------------------------------------------
  Retention            R       1=No, 2=Yes
  Cohort               C       1, 2, 3, 4
  PASS Participation   P       1=No, 2=Yes
  Ethnic Group         E       1=Caucasian, 2=African-American, 3=Other
  Gender               G       1=Male, 2=Female

Page 43: The Multigraph for  Loglinear Models

Example: WSU PASS Program

The best-fitting LLM has generating class [EG][CP][RC][PG]

Multigraph, M:

EG ---G--- PG ---P--- CP ---C--- RC

Page 44: The Multigraph for  Loglinear Models

Example: WSU PASS Program

M:    vertices EG, PG, CP, RC

S = {C}

M/S:  EG, PG, P, R

FCI:  [E,G,P⊗R|C]

C = Cohort, E = Ethnic, G = Gender, P = PASS Participation, R = Retention

Page 45: The Multigraph for  Loglinear Models

Example: Affinal Relations in Bosnia-Herzegovina
Data courtesy of Dr. Keith Doubt, Department of Sociology, Wittenberg University, Springfield, Ohio

N = 861 couples from Bosnia-Herzegovina were surveyed concerning affinal relations.

M: Marriage Type (traditional, elopement)
L: Location of Man and Wife (same, different)
E: Ethnicity (Bosniak, Serb, Croat)
S: Settlement (rural, urban)

Best-fitting model: [MLES]

Consider structural associations among M, L, and S for each ethnic group (E) separately.

Page 46: The Multigraph for  Loglinear Models

Example: Affinal Relations in Bosnia-Herzegovina

Bosniaks: [ML][LS]

Serbs: [MS][SL]

Croats: [M][L][S]

M: Marriage Type, L: Location of Man and Wife, S: Settlement

Page 47: The Multigraph for  Loglinear Models

Conclusions

The generator multigraph uses mathematical graph theory to analyze and interpret LLMs in a facile manner.

Properties of the multigraph allow one to:
– Find all conditional independencies
– Determine all collapsibility conditions

REFERENCE
Khamis, H.J. (2011). The Association Graph and the Multigraph for Loglinear Models, SAGE series Quantitative Applications in the Social Sciences, No. 167.

Page 48: The Multigraph for  Loglinear Models

Without data, you’re just one more person with an opinion