Big Social Data: The Social Turn in Big Data
DATA is the new OIL…
Richard Heimann © 2011
Long Tail of data science…
Long Tail: Intelligence Reporting, Science Data – Dark Data
Head: Big Data – Large, continuous datasets coincident over time and space. Ideal for multivariate analysis.
Tail (power law distribution): good for business but suboptimal for governance. Data in the tail are often individually curated and left unmaintained beyond their initially designed use case. As a result, the data are discontiguous from other research efforts and discontinuous over space and time. Dark data is data that is suspected to exist, or ought to exist, but is difficult or impossible to find. The problem of dark data is real and prevalent in the tail. The long tail is an intractably large management problem.
Long Tail of data science…
Head                       Tail
Homogeneous                Heterogeneous
Centralized curation       Individual curation
Maintained                 Unmaintained
Continuous over S & T      Discontinuous over S & T
Visibly accessible         DARK data
High velocity              Slow or NO velocity
High volume                Low volume
Long Tail of NSF data…
Power law            80%             20%              Total (NSF07)
Number of grants     7,478           1,869            9,347
Dollar amount        $938,548,595    $1,199,088,125   $2,137,636,716
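The arithmetic behind the NSF figures can be checked with a short sketch. The numbers are the ones on the slide; the group labels, variable names, and the derived mean award are illustrative assumptions, not part of the original deck.

```python
# Figures copied from the NSF07 slide; labels follow its 80% / 20% split.
grants = {"80% group": 7_478, "20% group": 1_869}
dollars = {"80% group": 938_548_595, "20% group": 1_199_088_125}

total_grants = sum(grants.values())
# Summed from the two groups; the slide states the total as $2,137,636,716.
total_dollars = sum(dollars.values())

summary = {}
for group in grants:
    summary[group] = {
        "grant_share": grants[group] / total_grants,      # share of grant count
        "dollar_share": dollars[group] / total_dollars,   # share of dollars
        "mean_award": dollars[group] / grants[group],     # average grant size
    }

for group, row in summary.items():
    print(
        f"{group}: {row['grant_share']:.1%} of grants, "
        f"{row['dollar_share']:.1%} of dollars, "
        f"mean award ${row['mean_award']:,.0f}"
    )
```

Read this way, the 80% of grants in the tail carry less than half of the dollars, which is the sense in which the tail is numerous but individually small.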
Honest Signals
[Figure panels: Random O&D Variable Xt (500 meter cell); Total O&D Variable Xt Count (500 meter cell)]
Deviation from spatial randomness suggests underlying social processes.
“Every observable effect has a cause” (Thales)
Perhaps the most profound honest signal is a rejection of randomness.
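One way to make "rejection of randomness" concrete is a Monte Carlo test on cell counts. The sketch below is an assumption about how such a test could look, not the method used for the slide: it takes counts per grid cell, computes the variance-to-mean ratio (near 1 when events fall at random), and compares it with counts simulated by dropping the same number of events into cells uniformly. The function names and the two 16-cell count vectors are hypothetical.

```python
import random
from statistics import mean, pvariance


def dispersion_index(counts):
    """Variance-to-mean ratio of cell counts; near 1 under spatial randomness."""
    return pvariance(counts) / mean(counts)


def csr_test(counts, n_sim=999, seed=0):
    """Return (observed index, one-sided Monte Carlo p-value for clustering)."""
    rng = random.Random(seed)
    n_cells, n_events = len(counts), sum(counts)
    observed = dispersion_index(counts)
    as_extreme = 0
    for _sim in range(n_sim):
        # Simulate randomness: each event lands in a uniformly chosen cell.
        simulated = [0] * n_cells
        for _event in range(n_events):
            simulated[rng.randrange(n_cells)] += 1
        if dispersion_index(simulated) >= observed:
            as_extreme += 1
    return observed, (as_extreme + 1) / (n_sim + 1)


# Hypothetical counts for 16 cells (not taken from the slides).
clustered = [0, 0, 1, 0, 12, 15, 0, 1, 0, 14, 11, 0, 0, 1, 0, 0]
even = [3, 4, 3, 4, 3, 4, 3, 4, 4, 3, 4, 3, 4, 3, 4, 3]

vmr_clustered, p_clustered = csr_test(clustered)
vmr_even, p_even = csr_test(even)
print(f"clustered: index={vmr_clustered:.2f}, p={p_clustered:.3f}")
print(f"even:      index={vmr_even:.2f}, p={p_even:.3f}")
```

A small p-value for the clustered counts is the kind of departure from randomness the slide treats as a signal; the test is one-sided, so it only flags clustering, not unusual evenness.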
Honest Signals – Spatial Randomness
Social Radar…
…what LTG Michael Flynn calls a warning system to inform policy makers of potential crises ‘left of boom’ …it yields understanding of the human landscape, advancing beyond standard terrain features.
Human geosensors represent social radar, particularly in areas unmarked by conflict and in sensor-poor environments.
What is Big Social Data?
Anyon (1982) argued that social science should be empirically grounded, theoretically explanatory, and socially critical.
Big Social Data is all of those things, with an emphasis on being socially critical.
The Social Turn of Big Data is upon us…
What is Big Social Data?
• Big Data + Tuned Social Consciousness.
• Velocity, volume, variety, veracity.
• Inward & outward asymptotics.
• Social + Spatial
Big Data, Small Theory
Statistically significant global variables that exhibit strong regional variation inform nuanced local decisions.
Statistically significant global variables that exhibit strong regional variation inform different local decisions.
Statistically significant global variables that exhibit little regional variation inform region-wide policy.
Spatial Simpson’s Paradox
Global standards will always compete with local social phenomena.
[Figure panels: Violence in the south; Violence in the north; Violence]
Big Data, Small Theory
Global models average regionally variant phenomena. Local models account for regional variation.
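A minimal sketch of the contrast between global and local models, using made-up numbers rather than anything from the slides: a single regression slope fitted to pooled data is compared with slopes fitted separately per region. The covariate `x`, the outcome `y` (standing in for violence), the region names, and the `ols_slope` helper are all hypothetical.

```python
def ols_slope(xs, ys):
    """Least-squares slope of y on x for one set of observations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical observations: region -> list of (x, y) pairs.
regions = {
    "north": [(1, 9), (2, 8), (3, 7), (4, 6), (5, 5)],
    "south": [(6, 14), (7, 13), (8, 12), (9, 11), (10, 10)],
}

# Local models: one slope per region.
local_slopes = {
    name: ols_slope([x for x, _ in obs], [y for _, y in obs])
    for name, obs in regions.items()
}

# Global model: one slope for all observations pooled together.
pooled = [pair for obs in regions.values() for pair in obs]
global_slope = ols_slope([x for x, _ in pooled], [y for _, y in pooled])

print("local slopes:", local_slopes)
print("global slope:", round(global_slope, 3))
```

In this constructed example the relationship is negative within each region but positive in the pooled fit, which is the spatial form of Simpson's paradox: the global model averages over regional differences and reports a trend that holds in neither region.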
Big Data, Small Theory
Stationarity (single equilibrium): a single process over space and across the study area.
Extreme heterogeneity (multiple equilibria): one process for every observation over space.
Big Data, Small Theory
Lacking Internal Validity | Lacking External Validity
Observed to be generally true
Sufficient generality to be useful as a norm
Deviations from the law should be interesting
Understanding of social process in context
…the Nomothetic & Idiographic debate is solved!!
Big Social Data?
…building better analytics.
learning more about our problems.
constructing locally variant, regionally flexible small theory.
Improve policy and decisions!
Big Data, Small Theory
Big Data, Small Theory
A study of 179 large companies found that those adopting "data-driven decision-making" achieved productivity gains that were 5 percent to 6 percent higher than other factors could explain.
• What if we could improve policy or intelligence analysis by 5 or 6 percent?
• What if we could improve decision support by at least 5 percent?
• What if we could improve productivity by at least 5 percent?
The End of Theory