Academy of Management Montreal, 6 August 2010 Empirical Exploration of Complexity in Human Systems:...
Academy of Management
Montreal , 6 August 2010
Empirical Exploration of
Complexity in Human Systems:
Data Collection & Interpretation Techniques
Power law statistics and Pareto Science
Pierpaolo Andriani
Durham Business School, University of Durham, UK
Sampling & inference
LIKELIHOOD distribution: PROB (data given population)
INFERENTIAL distribution: PROB (population given data)
Sample: sample mean, sample variance, etc.; sample size
Population: mean, variance, etc.; size is infinite
Population to sample: take a sample
Sample to population: infer a value
Statistical inference: Drawing conclusions about the whole population on the basis of a sample
Precondition for statistical inference: A sample is randomly selected from the population (= probability sample)
Representative agent links population to sample level and allows reduction of population complexity to single agent complexity
From Starbuck: The production of knowledge (2006)
• Consensus favoring use of null-hypothesis significance tests affords a clear example of paradigm stability. Although methodologists have been trying to discourage the use of these tests since the 1950s, the tests have remained very prevalent, and there is no sign that social scientists are shifting to other criteria. …. Hubbard and Ryan (2000: 678) concluded: ‘It seems inconceivable to admit that a methodology as bereft of value as SST (statistical significance tests) has survived, as the centerpiece of inductive inference no less, more than four decades of criticism in the psychological literature’.
p. 77
Starbuck: The production of knowledge (2006)
• Choosing two variables utterly at random, a researcher has 2-to-1 odds of finding a significant correlation on the first try, and 24-to-1 odds of finding a significant correlation within three tries. … the main inference I drew from these statistics was that the social sciences are drowning in statistically significant but meaningless noise. Because the differences and correlations that social scientists test have distributions quite different from those assumed in hypothesis test, social scientists are using tests that assign statistical significance to confounding background relationships. Because social scientists equate statistical significance with meaningful relationships, they often mistake confounding background relationships for theoretically important information. One result is that social science research creates a cloud of statistically significant differences and correlations that not only have no real meaning but also impede scientific progress by obscuring the truly meaningful relationships.
p. 49
Starbuck: The production of knowledge (2006)
• I began to think of statistical tests as arcane rituals that demonstrate membership in an esoteric subculture
p. 18
A tale of two worlds
World | Gaussian | Paretian
Statistics | Bell distribution (finite variance distributions) | Pareto (infinite variance)
Relations btw UoA | Independence (or weak interdependence) | Interdependence
Scientific 'approach' | Linear science (principle of superposition) | Non-linear science
Scaling property | Phenomena have proper scale | Phenomena are fractal
Unit of analysis | 'Things', entities | Relations
Property of world | Closure | Openness
Philosophical origin | Parmenides, Plato, Newton | Heraclitus, Aristotle
Variability | Limited | Unbounded
Gaussian: Exponential Network, bell curve distribution of node linkages (annotations: typical node; no large number)
Paretian: Scale-free Network, power-law distribution of node linkages
[Figure: number of nodes vs. number of links for each network type; the scale-free panel is also shown with both axes on a log scale]
From Barabasi/Bonabeau, Scientific American, May 2003
By assuming finite variability and compressing data around mean/variance, the Gaussian approach ignores or downplays extreme events on the right-hand side of the distribution, but also ignores or downplays tiny initiating events on the left-hand side of the distribution.
http://www.zazzle.com/statisticians_do_it_within_3_standard_deviations_tshirt-235087605979353103
Rationality, stock market and the butterfly effect
Growth-related power laws - ratio imbalances
1. Surface/volume law
Organisms; villages: In organisms, surfaces absorbing energy grow by the square but the organism grows by the volume, resulting in an imbalance (Galileo 1638, Carneiro 1987); fractals emerge to bring surface/volume back into balance. West and Brown (1997) show that several phenomena in biology, such as metabolic rate, height of trees, and life span, are described by allometric power laws whose exponents are multiples of ±¼. The cause is a fractal distribution of resources. Allometric power laws hold across 27 orders of magnitude (of mass).
2. Least effort
Language; transition: Word frequency is a function of ease of usage by both speaker/writer and listener/reader; this gives rise to Zipf’s (power) Law (1949); now found to apply to language, firms, and economies in transition (Ferrer i Cancho & Solé, 2003; Dahui et al., 2005; Ishikawa, 2005; Podobnik et al., 2006).
3. Hierarchical modularity
Growth unit connectivity: As cell fission occurs by the square, connectivity increases by n(n–1)/2, producing an imbalance between the gains from fission vs. the cost of maintaining connectivity; consequently organisms form modules so as to reduce the cost of connectivity; Simon argued that adaptive advantage goes to “nearly decomposable” systems (Simon, 1962; Bykoski, 2003). Complex adaptive systems: Heterogeneous agents seeking out other agents to copy/learn from so as to improve fitness generate networks; there is some probability of positive feedback such that some networks become groups, some groups form larger groups & hierarchies (Kauffman, 1969, 1993; Holland, 1995).
Combinations
4. Interactive breakage theory
Wealth; mass extinctions/explosions: A few independent elements having multiplicative effects produce lognormals; if the elements become interactive with positive feedback loops materializing, a power law results; based on Kolmogorov’s “breakage theory” of wealth creation (1941).
5. Combination theory
Number of exponentials; complexity: Combining multiple exponential or lognormal distributions, or increasing the complexity of components (subtasks, processes), results in a power law distribution (Mandelbrot, 1963; West & Deering, 1995; Newman, 2005).
6. Interacting fractals
Food web; firm & industry size, heartbeats: The fractal structure of a species is based on the food web (Pimm, 1982), which is a function of the fractal structure of predators and niche resources (Preston 1950; Halloy, 1998; Solé & Alonso, 1998; Camacho & Solé, 2001; Kostylev & Erlandsson, 2001, West, 2006).
Positive feedback loops
7. Preferential attachment
Nodes; gravitational attraction: Given newly arriving agents into a system, larger nodes with an enhanced propensity to attract agents will become disproportionately even larger, resulting in the power law signature (Yule, 1925; Young, 1928; Arthur, 1988; Barabási, 2000).
8. Irregularity-generated gradients
Coral growth; blockages: Starting with a random, insignificant irregularity, coupled with positive feedback, the initial irregularity increases its effect. This explains the growth of coral reefs and blockages changing the course of rivers (Juarrero, 1999; Turner, 2000; Barabási, 2005). Diffusion-limited aggregation (DLA). See also “niche constructionism” in biology (Odling-Smee, 2003).
Contextual effects
9. Phase transitions
Turbulent flows: Exogenous energy impositions cause autocatalytic, interaction effects and percolation transitions at a specific energy level—the 1st critical value—such that new interaction groupings form with a Pareto distribution (Bénard, 1901; Prigogine, 1955; Stauffer, 1985; Newman, 2005).
10. Self-organized criticality
Sandpiles; forests; heartbeats: Under constant tension of some kind (gravity, ecological balance, delivery of oxygen), some systems reach a critical state where they maintain stasis by preservative behaviors—such as sand avalanches, forest fires, changing heartbeat rate—which vary in size of effect according to a power law (Bak et al., 1987; Drossel & Schwabl, 1992; Bak, 1996).
11. Niche proliferation
Markets: When production, distribution, and search become cheap and easily available, markets develop a long tail of proliferating niches containing fewer customers; they become Paretian with mass-market products at one end and a long tail of niches at the other (Anderson, 2006).
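The preferential attachment mechanism (item 7 above) can be sketched in a few lines. This is a minimal illustration, not the model used in the cited papers: the function names, the network size and the one-link-per-new-node rule are all assumptions made for the example. It grows one network where newcomers prefer well-connected nodes and one where they pick a node uniformly at random, then compares the largest hubs.

```python
import random

def preferential_attachment(n_nodes, seed=0):
    """Grow a network node by node; each newcomer links to ONE existing node
    chosen with probability proportional to that node's current link count.
    (Illustrative rule, assumed for this sketch.)"""
    rng = random.Random(seed)
    degree = [1, 1]        # start from two nodes joined by one link
    endpoints = [0, 1]     # each link lists both of its endpoints once
    for new in range(2, n_nodes):
        # Picking a random link endpoint is a degree-proportional choice.
        target = rng.choice(endpoints)
        degree[target] += 1
        degree.append(1)
        endpoints.extend((target, new))
    return degree

def uniform_attachment(n_nodes, seed=0):
    """Same growth process, but the newcomer picks any existing node with
    equal probability (no positive feedback)."""
    rng = random.Random(seed)
    degree = [1, 1]
    for new in range(2, n_nodes):
        target = rng.randrange(new)
        degree[target] += 1
        degree.append(1)
    return degree

n = 20_000  # hypothetical network size
for name, grow in (("preferential", preferential_attachment),
                   ("uniform", uniform_attachment)):
    deg = grow(n)
    print(f"{name:>12}: mean links = {sum(deg) / n:.2f}, max links = {max(deg)}")
```

Both networks have the same mean number of links per node, yet the preferential version should produce far larger hubs, which is the signature the slide attributes to scale-free networks.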
Gaussian – heights of individuals
Tallest man (Robert Pershing Wadlow) 272 cm
Shortest man (He Pingping) 74 cm
Ratio: 272 / 74 ≈ 3.7
Source : Lada Adamic - http://www.hpl.hp.com/research/idl/papers/ranking/ranking.html
Source: Bak (1996) “How Nature Works”
Krugman on the Zipf law:
“we are unused to seeing regularities this
exact in economics – it is so exact that I find
it spooky” (1996) p.40
Largest city (Mumbai) population 13,922,125
Smallest city (Hum, Croatia ) pop. 23
Ratio: 13,922,125 / 23 ≈ 605,310
Paretian: city size
Hum, Croatia
Mumbai, India
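The two ratios quoted on these slides can be checked directly. The sketch below uses only the figures given above; the variable names are illustrative, and the orders-of-magnitude line is added just to make the contrast between the Gaussian and Paretian examples explicit.

```python
import math

# Figures as quoted on the slides
tallest_cm, shortest_cm = 272, 74
largest_city, smallest_city = 13_922_125, 23

height_ratio = tallest_cm / shortest_cm
city_ratio = largest_city / smallest_city

print(f"height ratio:    {height_ratio:.1f}")    # 3.7
print(f"city-size ratio: {city_ratio:,.0f}")     # 605,310

# Range spanned by each variable, in orders of magnitude (powers of ten)
print(f"heights span {math.log10(height_ratio):.1f} orders of magnitude")
print(f"cities span  {math.log10(city_ratio):.1f} orders of magnitude")
```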
Two tails of a power law
Casti _126
Find gutemberg
Gutenberg-Richter Law
[Figure axes: N_c (earthquakes/year) vs. earthquake magnitude (m_b) ~ log E]
Extreme events tail
Small events tail
Main properties of Paretian distributions
• Moments: Pr[X ≥ x] = k x^{-α}
• Largest value: maximum value depends on size of sample
• Highly skewed distribution (80/20 Rule)
• Scaling property: p(bx) = g(b) p(x) for any b
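Two of the properties listed above can be checked numerically. The sketch below is an illustration under stated assumptions: the exponent ALPHA = 1.5 and the constant K = 1 are arbitrary choices, and the scaling check is done on the tail function Pr[X ≥ x] rather than on the density p(x) (for a pure power law both rescale by a factor that depends on b only).

```python
import random

ALPHA = 1.5   # assumed tail exponent, chosen only for illustration
K = 1.0       # assumed constant; with K = 1 the smallest possible value is 1

def tail_prob(x, alpha=ALPHA, k=K):
    """Pr[X >= x] = k * x**(-alpha), for x at or above the minimum value."""
    return k * x ** (-alpha)

def pareto_sample(n, alpha=ALPHA, seed=0):
    """Inverse-transform sampling: U in (0, 1] gives X = U**(-1/alpha) >= 1."""
    rng = random.Random(seed)
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]

# Scaling property: rescaling x by b multiplies the tail by g(b) = b**(-alpha),
# whatever the starting value x is.
b = 3.0
for x in (2.0, 20.0, 200.0):
    print(f"x = {x:>5}: Pr[X >= bx] / Pr[X >= x] = {tail_prob(b * x) / tail_prob(x):.4f}")

# Largest value: the sample maximum keeps growing as the sample gets bigger.
for n in (100, 10_000, 100_000):
    print(f"n = {n:>7}: largest value = {max(pareto_sample(n)):.1f}")
```

Because the same seed is reused, each larger sample contains the smaller one, so the printed maximum can only stay the same or grow.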
Moments of distributions
1st: average
– Representative?
– Stable?
2nd: variance
– Finite or unbounded?
– Stable?
• 3rd: Skewness
• 4th: Kurtosis
[Figure: number of AOL visitors to other websites in 1997*]
* Lada Adamic, Zipf, Power-laws, and Pareto - a ranking tutorial, http://www.hpl.hp.com/research/idl/papers/ranking/ranking.html
Largest value
• Financial markets
Central limit theorem doesn’t apply. No convergence to the mean, no central tendency.
The world shows an unlimited and irreducible stock of surprises!
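The lack of convergence can be seen in a small simulation. This is a hedged sketch: the exponent alpha = 0.9 is an assumption picked so that even the theoretical mean is infinite (for exponents between 1 and 2 the mean exists but the variance does not, and it is the classical central limit theorem that fails). The helper names are illustrative.

```python
import random

def running_means(values, checkpoints):
    """Return {n: mean of the first n values} for each n in checkpoints."""
    total, out = 0.0, {}
    for i, v in enumerate(values, start=1):
        total += v
        if i in checkpoints:
            out[i] = total / i
    return out

def gaussian_draws(n, seed=1):
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def pareto_draws(n, alpha, seed=1):
    rng = random.Random(seed)
    return [rng.paretovariate(alpha) for _ in range(n)]

checkpoints = {100, 1_000, 10_000, 100_000}
gauss = running_means(gaussian_draws(100_000), checkpoints)
pareto = running_means(pareto_draws(100_000, alpha=0.9), checkpoints)

print("      n   Gaussian mean   Pareto mean (alpha = 0.9, assumed)")
for n in sorted(checkpoints):
    print(f"{n:>7}   {gauss[n]:>13.4f}   {pareto[n]:>11.2f}")
```

The Gaussian column should settle near zero as n grows, while the Pareto column is expected to keep jumping whenever a new extreme value enters the sample.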
Scalability
Scalability in financial markets
• Traditional statistics assume a bell-shaped distribution, with a typical scale (mean) and rapidly decaying tails.
• Neo-classical economics and equilibrium-based management theories assume normal distributions and descriptive/behavioral parameters gathering around means. Extreme events are very rare and therefore negligible.
• Power-law distributions show no typical scale (scale-free) and exhibit long fat tails (infinite variance). A power law explores the maximum dynamic range of diversity of the variable, limited only by size of network and agent.
• Extreme events are more frequent and their magnitude is disproportionately bigger than in the bell distribution case.
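One way to make "rapidly decaying" versus "long fat" tails concrete is to ask how much rarer an event becomes when its size doubles. The sketch below assumes a standard normal distribution for the Gaussian case and a Pareto tail with exponent 1.5 and minimum value 1 for the Paretian case; both parameter choices are illustrative, not taken from the slides.

```python
import math

def gaussian_tail(z):
    """Pr[Z > z] for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def pareto_tail(x, alpha=1.5, x_min=1.0):
    """Pr[X >= x] for a Pareto variable (alpha and x_min are assumed values)."""
    return (x / x_min) ** (-alpha) if x >= x_min else 1.0

print("threshold   Gaussian Pr[>2t]/Pr[>t]   Pareto Pr[>=2t]/Pr[>=t]")
for t in (1.0, 2.0, 4.0, 8.0):
    g_ratio = gaussian_tail(2 * t) / gaussian_tail(t)
    p_ratio = pareto_tail(2 * t) / pareto_tail(t)
    print(f"{t:>9}   {g_ratio:>23.3e}   {p_ratio:>23.3f}")
```

For the power law the ratio stays fixed at 2^(-alpha) at every threshold, so doubling the size of an event always costs the same factor in probability; for the Gaussian the ratio collapses toward zero, which is why extreme events are treated as negligible there.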
Which approach to statistics?
In a Gaussian world:
Challenge: manage the population
– How: reduce population to the representative agent and define variance (of population)
– Manage around mean and variance
Change: gradualism
– Extreme events (EEs) are exceedingly rare and can be treated as perturbation (system restores equilibrium after transient)
Scale-free theories
– Don’t exist in Gaussian systems
In a Paretian world:
Challenge: manage the frontier
– Identify outliers and manage the tails (together with the bulk) of the distribution
– Manage the tails
Change: extreme events
– EEs arise in the tails and determine the industry's next structure
Scale-free theories
– The growth of most systems follows a set of scaling trends that link tiny initiating events with more significant or even extreme outcomes.
The danger of averages
Thank you
Any questions?