
How to Build Social Science into Social Computing

Sun-Ki Chai Dept. of Sociology

University of Hawai`i

The Internet is unifying the research agendas of the computational and social sciences.

Technology <-> Social Interaction

One would expect a renaissance in science as a unitary discipline without boundaries.

Not.

Computer scientists/engineers lead funded research on modeling social interaction.

Social scientists generally come into projects as de facto subject matter experts (SMEs). An afterthought to meet RFP requirements.

Why so little social science in predictive social modeling?

•  From engineer’s point of view:
 –  Far too many competing theories
 –  Too “vague” to put into algorithmic form or lacking in technical interest
 –  Social scientists don’t understand software concepts and standards
 –  Most papers are devoted to observations of human subjects or societies, not new theories.

Why so little social science in predictive social modeling?

•  From social scientist’s point of view:
 –  Most are only vaguely aware of social computing
 –  Even when under more pressure to collaborate, they can’t find an entry point.
 –  Disillusioned when collaboration occurs.

A few tentative solutions

•  Learn to think like a social scientist
•  Find social scientists who are formal modeling-aware (there are plenty)
•  There is always a confirmed theory to fit your needs – use it
•  Avoid “parallel play” in projects
•  Talk, talk, talk
•  Find ways to make project rewarding within-discipline for each participant

Mainstream Social Science Methodology

Deductive/Nomothetic approach

•  General Theory from existing literature
•  Falsifiable hypotheses about specific empirical phenomena – operationalization
•  Ensure representative sample data drawn from human population
•  Hypotheses are confirmed or falsified, theories subject to modification w/o loss of generality or simplicity
•  Rinse and repeat.

Social Networks Models: The Early Days

•  Notion of formally analyzing social networks originates in “classical” sociology, particularly the work of Georg Simmel (1908)
 –  Group size; dyads vs. triads
 –  Tertius Gaudens (third who enjoys)
•  Also central in gestalt and attitudinal psychology
 –  Jacob Moreno’s sociograms (1931)
 –  Fritz Heider’s balance theory (1958)

The Development and Use of Centrality Measures

•  Centrality as a developing concept
 –  Early concepts of centrality
   •  Alex Bavelas (1948; 1950) - communication
   •  Marvin Shaw (1954) - small group behavior
 –  Degree, betweenness, closeness codified
   •  Linton Freeman (1979)
 –  Eigenvector → PageRank
   •  Phil Bonacich 1987
 –  Information
   •  Stephenson and Zelen 1989
 –  Flow (Betweenness)
   •  White and Smith 1989 (contested)
 –  Individual and group, local and global centrality
•  Centrality as predictor of prestige/status in organizations (primarily business organizations).
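As a rough illustration of two of the measures listed above, the sketch below computes normalized degree and closeness centrality in plain Python. The five-person "star" network and the function names are assumptions made for this example, not material from the talk, and the closeness function assumes a connected, undirected graph.

```python
from collections import deque

def degree_centrality(graph):
    """Normalized degree: number of neighbors divided by (n - 1)."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}

def closeness_centrality(graph):
    """Normalized closeness: (n - 1) / sum of shortest-path distances.

    Assumes the graph is connected and undirected.
    """
    n = len(graph)
    result = {}
    for source in graph:
        # Breadth-first search gives shortest path lengths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        result[source] = (n - 1) / sum(dist.values())
    return result

# Hypothetical network: four people who communicate only through "hub".
star = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub"},
    "d": {"hub"},
}

degree = degree_centrality(star)
closeness = closeness_centrality(star)
print(degree["hub"], degree["a"])
print(closeness["hub"], round(closeness["a"], 3))
```

In this toy network the hub scores highest on both measures, which is the pattern the prestige/status hypothesis above would build on.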

Structural Holes vs. Closure: Most Prominent Empirical Debate in Sociology

•  Structural Holes (Burt 1995) as a challenge to Centrality
 –  Connecting two otherwise unconnected communities (tertius jungens)
•  Longtime debate on relative causal importance of structural holes vs. closure (Coleman 1990) for prestige, SES

J.S. Coleman, Frontiers of Social Theory, p. 318-320

Social Science Content Analysis

•  Focus on the nature of political communication (1935) and propaganda (1939)

•  Based on the use of conceptual dictionaries

•  Typical dictionaries contain hundreds of concepts

•  “Lasswell” dictionary still in use as a measure of ideology
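A minimal sketch of how a conceptual dictionary can be applied to a text follows. The two concepts, their word lists, and the sample sentence are invented for illustration and are not drawn from the Lasswell dictionary or any other published dictionary.

```python
import re
from collections import Counter

# Hypothetical two-concept dictionary; real dictionaries hold hundreds of concepts.
CONCEPT_DICTIONARY = {
    "power": {"authority", "control", "command"},
    "affection": {"friend", "love", "care"},
}

def concept_frequencies(text, dictionary):
    """Count how many tokens in `text` fall under each concept."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for concept, terms in dictionary.items():
            if token in terms:
                counts[concept] += 1
    return counts

sample = "The party seeks control and authority; a friend would show care."
freqs = concept_frequencies(sample, CONCEPT_DICTIONARY)
print(dict(freqs))
```

Exact-match lookup is the simplest possible design; it ignores word stems and context, which is one reason human coding and accuracy checks remain part of the process.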

Other Milestones in Content Analysis History

•  Payne Fund Studies (1928) – examined content of movies and effects on children’s attitudes and knowledge

•  Victor Raimy (1948) – first automated affective (sentiment) analysis – conversation between counselor and client

•  Robert Bales (1950) – interaction process analysis – ties with symbolic interaction

•  Harold Garfinkel (1967) – conversation analysis in ethnomethodology

•  Philip Stone (1966) – first general concept computerized text analysis – Harvard Third Psychosociological Dictionary

•  Rick Holmes and Joe Woelfel (1982) – demonstration of content analysis without large mainframe – focus on communication theory

How Social Science Content Analysis Process Differs from Text Mining/Sentiment Analysis

•  Content analysis is a general method for extracting social meaning from artifacts (texts, pictures, videos, physical goods)

•  Theories of meaning are the backdrop for deriving latent (meaning) content from manifest (observable at the surface) symbols

•  Even when automated content analysis is the goal, subjective coding is almost always one step of the process, but it should be guided by theory
 –  Whenever possible, subject matter experts are chosen for coding
 –  Accuracy of subjective encoding is checked by intercoder reliability
   •  Scott's π, Cohen's κ, Krippendorff's α

•  After coding is compiled into a codebook or “dictionary”, it is applied across a wide sample of artifacts, measuring individual term and concept frequencies, then repeatedly testing for accuracy.
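To make the intercoder reliability check concrete, here is a small sketch of Cohen's κ for two coders. It uses the standard definition, κ = (p_o − p_e) / (1 − p_e), where p_o is observed agreement and p_e is the agreement expected by chance from each coder's own category frequencies. The coder labels below are made-up data.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labeled the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of each coder's marginal proportions, summed over labels.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical codings of four forum posts.
coder_a = ["pos", "pos", "neg", "neg"]
coder_b = ["pos", "neg", "neg", "neg"]
kappa = cohens_kappa(coder_a, coder_b)
print(kappa)
```

Scott's π and Krippendorff's α follow the same observed-versus-chance logic but estimate chance agreement differently, so the three statistics can disagree on the same data.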

Social Representativeness of Your Data

•  Defining the artifact: choose unit of analysis mapping most directly onto social phenomenon being modeled
 –  For the web, if we are looking to measure individual or group sentiments, the page is an inappropriate unit
•  Determining appropriate study population
 –  Be selective in identifying only web sites that represent your target real-world population, but identify all that do so.
   •  all of those who may mention the issue in passing?
   •  members of virtual communities centering around the issue?
•  Census or sample?
 –  If your population is very large, you may have to look at only a subset.
 –  If sample, what is your sampling frame?
 –  What is your sampling method – simple random, stratified, cluster, etc.?
•  How do you correct for bias?
 –  Deposit and Survival bias: stratifying on bias characteristics
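One way to read "stratifying on bias characteristics" is proportional stratified sampling: split the sampling frame by a characteristic linked to the suspected bias, then sample the same fraction from each stratum. The sketch below assumes a hypothetical frame of web sites tagged "old" or "new" (site age standing in for a survival-bias characteristic); the names and the 20% fraction are illustrative only.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, fraction, seed=0):
    """Draw the same fraction from every stratum of the sampling frame."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for name in sorted(strata):
        units = strata[name]
        k = max(1, round(len(units) * fraction))  # at least one unit per stratum
        sample.extend(rng.sample(units, k))
    return sample

# Hypothetical frame: 40 sites, every fourth one tagged "old".
frame = [(f"site{i}", "old" if i % 4 == 0 else "new") for i in range(40)]
sample = stratified_sample(frame, stratum_of=lambda unit: unit[1], fraction=0.2)
print(len(sample), sum(1 for _, age in sample if age == "old"))
```

Because each stratum is sampled at the same rate, older sites cannot be crowded out of the sample by the larger number of newer ones.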

Formal Rational Choice

•  Expected utility theory (solitary action)
 –  Preferences: strict order, completeness, asymmetry (= irreflexivity and acyclicity), and transitivity
   •  Cardinal/Interval Preferences represented by Utility function
   •  Conventionally: Egoistic, Materialistic, Isomorphic
 –  Beliefs: based on observations and legal (logical, probabilistic) inferences
   •  Conventionally: No other source of beliefs than the above
 –  Decision-Making: optimization - maximize expected utility in light of beliefs
•  Game theory (collective action)
 –  Strategic uncertainty
 –  Common knowledge of rationality

Exchange Theory

•  Attempt in sociology to synthesize rational choice, behaviorism and culture
 –  Homans (1958, 1960) views social interactions as exchanges
 –  Blau (1964) examines how hierarchical relations can transform exchange into coercion
 –  Emerson (1968) developed pioneering notion of exchanges embedded in social relationships

Experiments on Networks, Exchange and Power (Sociological Social Psychology)

Emerson’s “Children”:
•  Karen Cook
•  Linda Molm
•  David Willer
•  Toshio Yamagishi

Main concepts: dependence, power, trust, fairness

Formalisms apply game theory framework

Formal Models of Culture and Cultural Change

•  How to guarantee a unique game solution? Culture
 –  Culture as both tiebreaker (focal point, template, toolkit) and game-changer (altruism, norm convergence)
•  General Cultural Typologies
 –  Individualism/Collectivism
 –  Grid/Group Cultural Theory (Douglas, Wildavsky)
 –  Personality Inventories, e.g. Big 5
•  General Models of Cultural Change
   •  Dissonance
   •  Social Construction
   •  Narrative Theories
 –  Coherence Model

Standard Expected Utility Setup

Actions in Choice Set: A = {a_1, …, a_n}

States of Nature: S = {s_1, …, s_k}

Utility Function: U(a,s), a ∈ A, s ∈ S

Subjective Probabilities: 0 ≤ p(s) ≤ 1, ∑_{s∈S} p(s) = 1

Expected Utility: V(a) = E(U(a,s)) = ∑_{s∈S} p(s) U(a,s)
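A small numeric sketch of this setup may help. The two actions, two states, utilities, and probabilities below are invented for illustration; the code simply evaluates the sum of p(s) U(a,s) over states for each action and picks the maximizer.

```python
def expected_utility(action, states, p, U):
    """Sum of p(s) * U(a, s) over all states of nature."""
    return sum(p[s] * U[(action, s)] for s in states)

# Hypothetical choice problem.
A = ["umbrella", "no_umbrella"]          # actions in the choice set
S = ["rain", "sun"]                      # states of nature
p = {"rain": 0.25, "sun": 0.75}          # subjective probabilities, summing to 1
U = {                                    # utility of each (action, state) pair
    ("umbrella", "rain"): 8,
    ("umbrella", "sun"): 6,
    ("no_umbrella", "rain"): 0,
    ("no_umbrella", "sun"): 10,
}

V = {a: expected_utility(a, S, p, U) for a in A}
best_action = max(A, key=V.get)
print(V, best_action)
```

With these numbers, going without an umbrella has the higher expected utility (7.5 against 6.5), even though it is the worse choice if it rains.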

Concepts and Assumptions of Coherence Model

PREFERENCE AND BELIEF ASSUMPTIONS OF MODEL
•  Meta-optimization
•  Environment constrains Beliefs
•  No “Yogic Utility”

Parametric form, but not parametric values, determined by exposure to social communication.

Forms considered in order of message prevalence of communications describing such forms, but parameter weightings can be accepted or rejected.

Determination of Expected Regret

State-specific Optimal Action: a*(s) = argmax_{a∈A} U(a,s), i.e. ∀ s ∈ S: ∃ a*(s) ∈ A s.t. U(a*(s),s) ≥ U(a,s), ∀ a ∈ A

Expected Regret

d(a) = ∑_{s∈S} p(s) (U(a*(s),s) – U(a,s))

Actors adjust preferences (U) and beliefs (p) under constraints in order to minimize expected regret
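The sketch below computes expected regret for fixed preferences and beliefs, using an invented umbrella example (the actions, states, utilities, and probabilities are all assumptions). It covers only the regret calculation itself; the adjustment of U and p under constraints that the model describes is not implemented here.

```python
def state_optimal_action(state, actions, U):
    """a*(s): the action with the highest utility in a given state."""
    return max(actions, key=lambda a: U[(a, state)])

def expected_regret(action, actions, states, p, U):
    """d(a) = sum over s of p(s) * (U(a*(s), s) - U(a, s))."""
    total = 0.0
    for s in states:
        best = state_optimal_action(s, actions, U)
        total += p[s] * (U[(best, s)] - U[(action, s)])
    return total

# Hypothetical choice problem.
A = ["umbrella", "no_umbrella"]
S = ["rain", "sun"]
p = {"rain": 0.25, "sun": 0.75}
U = {
    ("umbrella", "rain"): 8,
    ("umbrella", "sun"): 6,
    ("no_umbrella", "rain"): 0,
    ("no_umbrella", "sun"): 10,
}

regret = {a: expected_regret(a, A, S, p, U) for a in A}
print(regret)
```

For fixed U and p, the action with the lowest expected regret is also the one with the highest expected utility, because the best-case term does not depend on the action chosen. Regret can only be reduced further by changing U or p, which is the mechanism the slide points to.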

CCPV Project Teams

Retrospective testing team
Laboratory experiment team
Field experiment team
Ethnic dataset team
Virtual communities crawler team
Identity based conflict simulation team

Iterative Community Development

[Figure: diagram with elements labeled Community, Candidate sites, Single page, Community sites, and Seed site. Annotations shown: LINKS=14 CONTENT=0.89; LINKS=10 CONTENT=0.65; LINKS=8 CONTENT=0.78.]

Charting Forum and Keyword Sentiment Over Time

Significant Pearson Correlations of Content Categories and Grid/Group

          |------------------ psychological categories ------------------|  |------ personal concerns ------|
          sad        percept    see        social     humans     ingest     home       relig      death
Grid      0.2490*    -0.2380*   -0.2905*   0.2081     0.1997     -0.2524*   -0.2399*   0.1304     0.1002
p-value   0.0349     0.0441     0.0133     0.0795     0.0927     0.0324     0.0424     0.2751     0.4025
Group     0.1838     -0.0555    -0.1023    0.2626*    0.3159*    -0.2760*   -0.0783    0.2668*    0.2643*
p-value   0.1223     0.643      0.3927     0.0259     0.0069     0.019      0.5131     0.0235     0.0248
N         72         72         72         72         72         72         72         72         72

* indicates p < 0.05, two-tailed test (yellow-shaded boxes in the original slide). Content categories taken from LIWC content analysis dictionary, Pennebaker 1994.
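The slide does not say how the p-values were obtained. Assuming the usual t test for a Pearson correlation, t = r·sqrt(N − 2) / sqrt(1 − r²) with N − 2 degrees of freedom, the sketch below checks two Grid entries from the table against an approximate two-tailed 5% critical value for 70 degrees of freedom (about 1.994, taken from standard t tables).

```python
import math

def t_statistic(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Approximate two-tailed 5% critical value for 70 degrees of freedom (assumed from t tables).
T_CRIT_DF70 = 1.994

checks = {}
for label, r in [("Grid x sad", 0.2490), ("Grid x social", 0.2081)]:
    t = t_statistic(r, 72)
    checks[label] = (t, abs(t) > T_CRIT_DF70)
    print(label, round(t, 3), abs(t) > T_CRIT_DF70)
```

Under this assumption the first correlation clears the 5% threshold and the second does not, which agrees with the asterisks in the table.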

Grid-Group, Social Norms, and Effect on Behavior in the Lab

•  Hypotheses:
 –  High-group people cooperate more in social dilemma games, offer more in ultimatum bargaining games, and trust more in trust games;
 –  High-grid people show strong reciprocity, i.e. punish non-contributors/violators at cost to self, return more if trusted, and reject low offers in ultimatum games

Research question: How do people’s grid and group cultural attributes affect their social behavior?

Approach: measure individual grid and group scores using a survey instrument, then correlate them with behavior in various formal games


Voluntary Contributions Mechanism Correlations

Condition   Measure                              Grid (Pearson r)   Sig. (2-tailed)   Group (Pearson r)   Sig. (2-tailed)   N
Shuffled    Contribution level, no punishment    -0.100             0.392             0.350***            0.002             76
Shuffled    Contribution, with punishment        0.038              0.748             0.096               0.411             76
Shuffled    Punishment expenditure               0.263**            0.022             -0.116              0.318             76
Partner     Contribution level, no punishment    0.261*             0.052             0.251*              0.062             56
Partner     Contribution, with punishment        -0.075             0.582             0.171               0.208             56
Partner     Punishment expenditure               0.056              0.683             -0.218              0.106             56
Pooled      Contribution level, no punishment    0.108              0.217             0.319***            0.000             132
Pooled      Contribution, with punishment        -0.002             0.983             0.138               0.114             132
Pooled      Punishment expenditure               0.132              0.131             -0.155*             0.077             132

*** Correlation is significant at the 0.01 level (2-tailed).
** Correlation is significant at the 0.05 level (2-tailed).
* Correlation is significant at the 0.10 level (2-tailed).

Theoretical Team: Grid-Group and Coherence

Groupness-transformed payoff:
  y_i = (Σ_{j≠i} g_ij x_j) + x_i

Gridness-transformed payoff:
  u_i = y_i (ord(a_i = o_i) + (1 – h_ij) ord(a_i ≠ o_i))

where g_ij and h_ij are group and grid coefficients for individual i vis-à-vis individual j, a_i is her action, x_j is untransformed payoff, and o_i her specified operation under the social norms of her group.

Coherence (preference-based): adjustment of g, h to minimize d

Expected Regret (single-period, individual form):
  d = ∫_{s∈S} (u(s,a*(s)) – u(s,a)) p(s) ds
where
  a*(s) = argmax_{a∈A} u(s,a)
  a = argmax_{a∈A} ∫_{s∈S} u(s,a) p(s) ds

s states of the environment, a actions, u utility function, and p subjective probabilities
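As a rough numeric reading of the two payoff transformations, the sketch below treats ord(·) as a 0/1 indicator and, as a simplifying assumption, uses a single grid coefficient h_i per individual in place of the pairwise h_ij. The payoffs, coefficients, and action labels are invented for illustration.

```python
def groupness_payoff(i, x, g):
    """y_i = sum over j != i of g[i][j] * x[j], plus i's own payoff x[i]."""
    return sum(g[i][j] * x[j] for j in range(len(x)) if j != i) + x[i]

def gridness_payoff(y_i, action, norm_action, h_i):
    """u_i = y_i if the action matches the norm, else y_i scaled by (1 - h_i).

    h_i is a per-individual simplification of the pairwise h_ij (an assumption).
    """
    return y_i * (1.0 if action == norm_action else 1.0 - h_i)

# Hypothetical two-person example.
x = [10.0, 4.0]                 # untransformed payoffs
g = [[0.0, 0.5], [0.2, 0.0]]    # group coefficients g[i][j]

y0 = groupness_payoff(0, x, g)  # 0.5 * 4 + 10
y1 = groupness_payoff(1, x, g)  # 0.2 * 10 + 4

u0 = gridness_payoff(y0, action="contribute", norm_action="contribute", h_i=0.25)
u1 = gridness_payoff(y1, action="defect", norm_action="contribute", h_i=0.25)
print(y0, y1, u0, u1)
```

Higher g makes other people's payoffs count for more, while higher h makes deviation from the group norm more costly, which matches the lab hypotheses about high-group and high-grid participants.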

Grid-Group and Coherence

[Figure: scatter plot of countries and regions by grid score (x-axis, .3 to .8) and group score (y-axis, .4 to .6). Points labeled: Sweden, W Germany, E Germany, Basque, Norway, Finland, Galicia, UK, Japan, Australia, Valencia, Croatia, Latvia, Slovenia, Spain, Estonia, Bulgaria, USA, Argentina, Andalusia, S.Korea, Serbia, Russia, Montenegro, Ukraine, Taiwan, Dominica, Belarus, Peru, Lithuania, Puerto Rico, S.Africa, Mexico, Moldova, Philippines, Bangladesh, Nigeria, Ghana, Brazil, Venezuela, India, Macedonia, Azerbaijan, China, Bosnia Herceg, Armenia, Chile, Georgia, Poland, Colombia, Pakistan, Turkey, Tambov.]

Relative Grid-Group scores in Wave 3 of WVS

The Moro and its layers of ethnic identities

Local/geographic/cultural ethnic identity: 13 Moro ethnolinguistic groups

1)  Maranao

2)  Maguindanao

3)  Tausug

4)  Yakan

5)  Cotabato

*they go their separate ways if they differ in POLITICS.

“Muslim Filipino” socio-political identity (and situational ethnicity). It is not a national identity. Muslim Filipinos vs Christian Filipinos.

Indo-Malay Muslim regional identity (shared heritage with Muslims of Southeast Asia); religious identity as moderate Muslims.

Pan-Muslim global identity based on the affiliation to the Ummah (hence they can appeal to Muslim Arabs for external support to the Muslim world). Muslim vs Non-Muslims

Moro is a socio-political identity based on symbolic ethnicity rooted in the fight for the Bangsamoro nation. It is an ethnic resource used by the Moro National Liberation Front, Moro Islamic Liberation Front & Abu Sayyaf Group to pursue their goals.

Individual identity based on situational ethnicity

Cultural and Intercultural Attitudes among Moro and Non-Moro Groups

Number of participants: 328 (including pilot), 306 (excluding pilot)

Distribution by site: Taguig = 102 (33.3%), Culiat = 104 (34.0%), Greenhills = 100 (32.7%)

Distribution by religion: Islam = 192 (62.6%), Christian = 114 (37.3%)

Moro Distribution by Ethnicity (defined by language): Maranao = 85 (27.8%), Maguindanao = 31 (10.1%), Tausog = 36 (11.2%), Yakan = 15 (4.9%), Balik Islam = 17 (5.6%), Other Muslim (Iranon/Kalagan/Samal) = 8 (2.6%)

Field Study: Risk Preferences Among Moro Ethnic Groups

[Figure: density plots of risk (x-axis "risk", 0 to 6; y-axis "Density", 0 to .4), one panel per group: balik islam, bisaya, cebuano, ilocano, ilonggo, maguindanao, maranao, other christian, other muslim, tagalog, tausog, yakan. Graphs by eth2.]

Constructionist Behavioral Simulation

Behavioral Modeling of Coalition Formation

Ethnic Groups are Predicted, not taken as Given