Correlation implies Causation? Saad Saleh, Team Lead, Wisnet Lab, SEECS, saad.saleh@seecs.edu.pk
Contents
• Correlation
• Causality
• Examples
• Causal Research
• Causality Techniques:
• Granger Causality
• Zhang Causality
• Peter Causality
• LINGAM Causality
• Practical Applications
• Conclusion
Correlation
• Correlation measures how closely related two sets of data are
• In statistics, Dependence refers to any statistical relationship between two random variables or two sets of data. Correlation refers to any of a broad class of statistical relationships involving dependence.
[wiki : http://en.wikipedia.org/wiki/Correlation_and_dependence]
• Relates to closeness, implying a relationship between objects, people, events, etc.
For example, people often believe there are more bizarre behaviors exhibited when the moon is full.
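To make "how closely related" concrete, here is a minimal sketch of the Pearson correlation coefficient, one common member of the class of relationships described above. The function name `pearson_r` and the sample numbers are made up for illustration and do not come from the slides.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()          # centre both variables
    yc = y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

# Hypothetical data: y falls steadily as x rises, so r should be close to -1.
x = [20, 30, 40, 50, 60, 70, 80]
y = [590, 560, 510, 480, 450, 420, 400]
r = pearson_r(x, y)
print(f"r = {r:.3f}")
```

A value near +1 or -1 indicates a strong linear relationship, while a value near 0 indicates little linear relationship. The coefficient says nothing about why the two variables move together.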
Causality
• Causality (also referred to as causation) is the relation between an event (the cause) and a second event (the effect), where the second event is understood as a consequence of the first.
[Random House Unabridged Dictionary]
Correlation Examples
Driver's Age vs Sign Legibility Distance
Driver’s age is negatively correlated with sign legibility distance.
Speed vs Fuel Consumption
Speed is correlated with the vehicle's fuel consumption.
Incentive vs Percentage Returned
Incentive is positively correlated with the percentage returned.
Gun ownership vs Crime rate
Gun ownership correlates positively with crime rate (r = 0.71).
In a Gallup poll, surveyors asked, “Do you believe correlation implies causation?”
• 64% of Americans answered “Yes”.
• 28% replied “No”.
• The other 8% were undecided.
See 10 simple questions to check
the influence of correlation over causality
Does Ice Cream Consumption Lead to Drowning??
Ice cream consumption is positively correlated with the number of drownings.
Do Pirates Stop Global Warming??
The number of pirates is negatively correlated with global temperature.
Does Shoe Size Increase Reading Ability??
Shoe size is positively correlated with reading ability.
Do Firemen Cause Large Fire Damage??
The number of firemen is positively correlated with the amount of damage.
Does Nationality Affect SAT Score??
SAT scores are correlated with nationality.
Is Cholesterol Level Affected by Facebook??
Cholesterol level is correlated with the invention of Facebook.
Are Bad Movies Made Because of Low Newspaper Sales??
Shyamalan's production of bad movies is correlated with newspaper sales.
Can Internet Explorer Affect Murder Rate??
Use of Internet Explorer is correlated with the murder rate.
Can Mexican Lemon Imports Affect Highway Deaths??
Mexican lemon imports are correlated with highway deaths.
Are Nobel Prizes Won by Chocolate Consumption??
The number of Nobel prizes won by a country (adjusting for population) correlates well with per capita chocolate consumption.
(New England Journal of Medicine)
Reality: Correlation vs. Causation
• “The correlation between workers’ education levels and wages is strongly positive”
• Does this mean education “causes” higher wages?
• We don’t know for sure!
• Correlation tells us two variables are related BUT does not tell us why
Reality: Correlation vs. Causation
• Possibility 1
• Education improves skills, and skilled workers get better-paying jobs
• Education causes wages to rise
• Possibility 2
• Individuals are born with quality A, which is relevant for success in education and on the job
• Quality A (NOT education) causes wages to rise
Correlation vs Causation
Without proper interpretation, causation should not be assumed,
or even implied.
Causal Research
• If the objective is to determine which variable might be causing a certain behavior (whether there is a cause and effect relationship between variables) causal research must be undertaken.
Causal discovery
What affects…
…your health?
…climate changes?
…the economy?
Which actions will have beneficial effects?
Available data
• A lot of “observational” data.
Correlation ≠ Causality!
• Experiments are often needed, but:
• Costly
• Unethical
• Infeasible
Establishing Causality
• To establish whether two variables are causally related, that is, whether a change in the independent variable X results in a change in the dependent variable Y, you must establish:
• Time order: The cause must have occurred before the effect
• Co-variation (statistical association): Changes in the value of the independent variable must be accompanied by changes in the value of the dependent variable
• Rationale: There must be a logical and compelling explanation for why these two variables are related
• Non-spuriousness: It must be established that the independent variable X, and only X, was the cause of changes in the dependent variable Y; rival explanations must be ruled out.
• Note that it is never possible to prove causality, but only to show to what degree it is probable.
Causation Possibilities
• A causes B.
• B causes A.
• A and B both partly cause each other.
• A and B are both caused by a third factor, C.
• The observed correlation was due purely to chance.
Third or Missing Variable Problem
A relationship other than causal might exist between the two variables.
It is possible that there is some other variable or factor that is causing the outcome.
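The third-variable problem can be sketched with a small simulation. The variable names (`temperature`, `ice_cream`, `drownings`) and all coefficients below are invented for illustration: both observed variables depend only on a common cause, yet they end up strongly correlated. Removing the linear effect of the common cause makes the association disappear.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical common cause C and two variables that depend only on C.
temperature = rng.normal(size=n)
ice_cream = 2.0 * temperature + rng.normal(size=n)   # A <- C
drownings = 1.5 * temperature + rng.normal(size=n)   # B <- C

def residual(v, c):
    """Remove the linear effect of c from v (least-squares line with intercept)."""
    slope, intercept = np.polyfit(c, v, 1)
    return v - (slope * c + intercept)

# Raw correlation between A and B, ignoring C.
raw_r = np.corrcoef(ice_cream, drownings)[0, 1]

# Partial correlation: correlate what is left of A and B after accounting for C.
partial_r = np.corrcoef(residual(ice_cream, temperature),
                        residual(drownings, temperature))[0, 1]

print(f"raw correlation:     {raw_r:.3f}")
print(f"partial correlation: {partial_r:.3f}")
```

This only works here because the confounder is observed and its effect is linear. With an unobserved third variable, the raw correlation is all that the data shows.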
Causal graph example
[Figure: causal graph with nodes Lung Cancer, Smoking, Genetics, Coughing, Attention Disorder, Allergy, Anxiety, Peer Pressure, Yellow Fingers, Car Accident, Born an Even Day, Fatigue]
A ? B
[Figure: A = log(Altitude), B = Temperature]
Best fit: A -> B
A -> B A <- B
Linear case?
A -> B A <- B
• Linear function
• Gaussian input
• Gaussian noise
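A short derivation may help show why this combination gives no clue about direction. The symbols α, β, N and Ñ are notation introduced here for illustration and are not taken from the slides.

```latex
% Forward model: linear function, Gaussian input, Gaussian noise
B = \alpha A + N, \qquad
A \sim \mathcal{N}(0, \sigma_A^2), \quad
N \sim \mathcal{N}(0, \sigma_N^2), \quad
N \perp A .

% Define a backward coefficient and a backward residual
\beta = \frac{\operatorname{Cov}(A, B)}{\operatorname{Var}(B)}
      = \frac{\alpha \sigma_A^2}{\alpha^2 \sigma_A^2 + \sigma_N^2},
\qquad
\tilde{N} = A - \beta B .

% The backward residual is uncorrelated with B
\operatorname{Cov}(\tilde{N}, B)
  = \operatorname{Cov}(A, B) - \beta \operatorname{Var}(B) = 0 .

% (A, B) is jointly Gaussian, so uncorrelated implies independent:
A = \beta B + \tilde{N}, \qquad \tilde{N} \perp B .
```

The backward model therefore has exactly the same form as the forward one (linear function, Gaussian input, independent Gaussian noise), so the data alone cannot favour A -> B over A <- B in this case.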
Google Trends &
Google Correlate
Approach 1: Granger Causality
Prof. Clive W.J. Granger, recipient of the 2003 Nobel Prize in Economics
History
In the early 1960's, I was considering a pair of related stochastic processes which were clearly inter-related and I wanted to know if this relationship could be broken down into a pair of one way relationships. It was suggested to me to look at a definition of causality proposed by a very famous mathematician, Norbert Wiener, so I adapted this definition (Wiener 1956) into a practical form and discussed it.
Applied economists found the definition understandable and useable and applications of it started to appear. However, several writers stated that "of course, this is not real causality, it is only Granger causality."
Clive W. J. Granger
Granger's Idea
“If variables are cointegrated, the relationship among them can be expressed as an Error Correction Mechanism (ECM)”.
Granger Causality
• Suppose that we have three terms, Xt , Yt , and Wt , and that we first attempt to forecast Xt+1 using past terms of Xt and Wt (without Yt).
• We then try to forecast Xt+1 using past terms of Xt , Wt ,and Yt (with Yt).
• If the second forecast is found to be more successful, according to standard cost functions, then the past of Y appears to contain information helping in forecasting Xt+1 that is not in past Xt or Wt .
• In short, Yt would "Granger cause" Xt+1 if
• Yt occurs before Xt+1 ;
• it contains information useful in forecasting Xt+1 that is not found in a group of other appropriate variables.
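The forecasting comparison described above can be sketched as follows. This is a minimal illustration on simulated data, not a full Granger test: the coefficients, the single lag, and the helper name `one_step_mse` are all assumptions made for the example, and mean squared error stands in for the "standard cost functions".

```python
import numpy as np

rng = np.random.default_rng(1)
T = 2000

# Simulated series (made-up coefficients): past Y and past W both feed into X.
w = rng.normal(size=T)
y = np.zeros(T)
x = np.zeros(T)
for t in range(1, T):
    y[t] = 0.6 * y[t - 1] + rng.normal()
    x[t] = 0.5 * x[t - 1] + 0.8 * y[t - 1] + 0.3 * w[t - 1] + rng.normal()

def one_step_mse(target, predictors):
    """Fit target[t+1] on predictors[t] by least squares using the first half
    of the sample, then return the forecast MSE on the second half."""
    X = np.column_stack([np.ones(T - 1)] + [p[:-1] for p in predictors])
    z = target[1:]
    half = (T - 1) // 2
    beta, *_ = np.linalg.lstsq(X[:half], z[:half], rcond=None)
    err = z[half:] - X[half:] @ beta
    return float(np.mean(err ** 2))

mse_without_y = one_step_mse(x, [x, w])      # first forecast: past X and W only
mse_with_y = one_step_mse(x, [x, w, y])      # second forecast: past X, W and Y

print(f"forecast MSE without past Y: {mse_without_y:.3f}")
print(f"forecast MSE with past Y:    {mse_with_y:.3f}")
```

If the second forecast is clearly better, past Y carries information about future X that is not already in past X and W, which is the sense in which Y would "Granger cause" X.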
Vector Autoregression (VAR): Mathematical Definition
$$[Y]_t = [A][Y]_{t-1} + \dots + [A'][Y]_{t-k} + [e]_t$$
or
$$
\begin{bmatrix} Y_{1t} \\ Y_{2t} \\ Y_{3t} \\ \vdots \\ Y_{pt} \end{bmatrix}
=
\begin{bmatrix}
A_{11} & A_{12} & A_{13} & \cdots & A_{1p} \\
A_{21} & A_{22} & A_{23} & \cdots & A_{2p} \\
A_{31} & A_{32} & A_{33} & \cdots & A_{3p} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
A_{p1} & A_{p2} & A_{p3} & \cdots & A_{pp}
\end{bmatrix}
\begin{bmatrix} Y_{1,t-1} \\ Y_{2,t-1} \\ Y_{3,t-1} \\ \vdots \\ Y_{p,t-1} \end{bmatrix}
+ \cdots +
\begin{bmatrix}
A'_{11} & A'_{12} & A'_{13} & \cdots & A'_{1p} \\
A'_{21} & A'_{22} & A'_{23} & \cdots & A'_{2p} \\
A'_{31} & A'_{32} & A'_{33} & \cdots & A'_{3p} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
A'_{p1} & A'_{p2} & A'_{p3} & \cdots & A'_{pp}
\end{bmatrix}
\begin{bmatrix} Y_{1,t-k} \\ Y_{2,t-k} \\ Y_{3,t-k} \\ \vdots \\ Y_{p,t-k} \end{bmatrix}
+
\begin{bmatrix} e_{1t} \\ e_{2t} \\ e_{3t} \\ \vdots \\ e_{pt} \end{bmatrix}
$$
where:
• p = the number of variables considered in the system
• k = the number of lags considered in the system
• [Y]t, [Y]t-1, …, [Y]t-k = the p x 1 vectors of variables
• [A], …, [A'] = the p x p matrices of coefficients to be estimated
• [e]t = a p x 1 vector of innovations that may be contemporaneously correlated but are uncorrelated with their own lagged values and uncorrelated with all of the right-hand side variables.
Vector Autoregression (VAR): Example
Consider a case in which the number of variables n is 2, the number of lags p is 1 and the constant term is suppressed. For concreteness, let the two variables be called money, mt and output, yt .
The structural equations will be:
$$m_t = \beta_1 y_t + \gamma_{11} m_{t-1} + \gamma_{12} y_{t-1} + \varepsilon_{mt}$$
$$y_t = \beta_2 m_t + \gamma_{21} m_{t-1} + \gamma_{22} y_{t-1} + \varepsilon_{yt}$$
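Then, the reduced form is
$$m_t = \frac{\gamma_{11} + \beta_1 \gamma_{21}}{1 - \beta_1 \beta_2}\, m_{t-1} + \frac{\gamma_{12} + \beta_1 \gamma_{22}}{1 - \beta_1 \beta_2}\, y_{t-1} + \frac{\varepsilon_{mt} + \beta_1 \varepsilon_{yt}}{1 - \beta_1 \beta_2} = \pi_{11} m_{t-1} + \pi_{12} y_{t-1} + \mu_{1t}$$
$$y_t = \frac{\gamma_{21} + \beta_2 \gamma_{11}}{1 - \beta_1 \beta_2}\, m_{t-1} + \frac{\gamma_{22} + \beta_2 \gamma_{12}}{1 - \beta_1 \beta_2}\, y_{t-1} + \frac{\varepsilon_{yt} + \beta_2 \varepsilon_{mt}}{1 - \beta_1 \beta_2} = \pi_{21} m_{t-1} + \pi_{22} y_{t-1} + \mu_{2t}$$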
Statistics computed from VARs are helpful in assessing Granger causality.
Granger causality tests have been interpreted as testing, for example, the validity of the monetarist proposition that autonomous variations in the money supply have been a cause of output fluctuations.
Vector Autoregression (VAR): Granger Causality
In a regression analysis, we deal with the dependence of one variable on other variables, but it does not necessarily imply causation.
In our GDP and M example, the often asked question is whether GDP → M or M → GDP. Since we have two variables, we are dealing with bilateral causality.
Given the previous GDP and M VAR equations:
$$m_t = \beta_1 y_t + \gamma_{11} m_{t-1} + \gamma_{12} y_{t-1} + \varepsilon_{mt}$$
$$y_t = \beta_2 m_t + \gamma_{21} m_{t-1} + \gamma_{22} y_{t-1} + \varepsilon_{yt}$$
Vector Autoregression (VAR): Granger Causality
We can distinguish four cases:
• Unidirectional causality from M to GDP
• Unidirectional causality from GDP to M
• Feedback or bilateral causality
• Independence
Assumptions:
• The variables GDP and M are stationary
• The number of lag terms is chosen appropriately (the outcome can be sensitive to it)
• Error terms are uncorrelated; if they are correlated, an appropriate transformation is necessary
Vector Autoregression (VAR): Granger Causality – Estimation (t-test)
A variable, say $m_t$, is said to fail to Granger cause another variable, say $y_t$, relative to an information set consisting of past m’s and y’s if: $E[y_t \mid y_{t-1}, m_{t-1}, y_{t-2}, m_{t-2}, \dots] = E[y_t \mid y_{t-1}, y_{t-2}, \dots]$.
$$m_t = \pi_{11} m_{t-1} + \pi_{12} y_{t-1} + \mu_{1t}$$
$$y_t = \pi_{21} m_{t-1} + \pi_{22} y_{t-1} + \mu_{2t}$$
• $m_t$ does not Granger cause $y_t$ relative to an information set consisting of past m’s and y’s iff $\pi_{21} = 0$.
• $y_t$ does not Granger cause $m_t$ relative to an information set consisting of past m’s and y’s iff $\pi_{12} = 0$.
• In a bivariate case, as in our example, a t-test can be used to test the null hypothesis that one variable does not Granger cause another variable. In higher order systems, an F-test is used.
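As a rough illustration of the t-test idea, the sketch below simulates a bivariate one-lag system in which past m feeds into y but past y does not feed into m, estimates the y equation by ordinary least squares, and computes the t-statistic of the coefficient on lagged m. The coefficients, the sample size and the helper `ols_t_stats` are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 1500

# Simulated reduced-form system (made-up coefficients, constant suppressed).
m = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    m[t] = 0.5 * m[t - 1] + rng.normal()                     # lagged y does not enter
    y[t] = 0.4 * m[t - 1] + 0.3 * y[t - 1] + rng.normal()    # lagged m enters with 0.4

def ols_t_stats(X, z):
    """Ordinary least squares coefficients and their t-statistics."""
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof                 # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)        # coefficient covariance matrix
    return beta, beta / np.sqrt(np.diag(cov))

# Regress y_t on (m_{t-1}, y_{t-1}); the first coefficient is the one on lagged m.
X = np.column_stack([m[:-1], y[:-1]])
beta_y, t_y = ols_t_stats(X, y[1:])

print(f"estimated coefficient on lagged m: {beta_y[0]:.3f}")
print(f"t-statistic:                       {t_y[0]:.1f}")
```

A large absolute t-statistic leads to rejecting the null hypothesis that the coefficient on lagged m is zero, that is, rejecting "m does not Granger cause y" for this simulated data.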
Vector Autoregression (VAR): Granger Causality – Estimation (F-test)
1. Regress current GDP on all lagged GDP terms but do not include the lagged M variables (restricted regression). From this, obtain the restricted residual sum of squares, $RSS_R$.
2. Run the regression including the lagged M terms (unrestricted regression). Also get the residual sum of squares, $RSS_{UR}$.
3. The null hypothesis is H0: all coefficients on the lagged M terms are zero, that is, the lagged M terms do not belong in the regression.
4. Compute the F statistic
$$F = \frac{(RSS_R - RSS_{UR})/m}{RSS_{UR}/(n - k)}$$
where m is the number of lagged M terms, k is the number of parameters estimated in the unrestricted regression, and n is the number of observations.
5. If the computed F > critical F value at a chosen level of significance, we reject the null, in which case the lagged M terms belong in the regression. This is another way of saying that M Granger-causes GDP.
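The restricted and unrestricted regressions can be sketched with plain least squares. The helper names `rss` and `granger_f`, the simulated series and their coefficients are assumptions made for illustration. The statistic is computed as ((RSS_R − RSS_UR)/m) / (RSS_UR/(n − k)), with `lags` playing the role of m.

```python
import numpy as np

def rss(X, z):
    """Residual sum of squares of a least-squares fit of z on X."""
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    return float(resid @ resid)

def granger_f(target, other, lags=1):
    """F statistic for H0: lagged `other` terms do not belong in the
    regression of `target` on its own lags."""
    n = len(target) - lags
    z = target[lags:]
    ones = np.ones(n)
    own_lags = [target[lags - j: len(target) - j] for j in range(1, lags + 1)]
    other_lags = [other[lags - j: len(other) - j] for j in range(1, lags + 1)]
    X_r = np.column_stack([ones] + own_lags)                  # restricted
    X_ur = np.column_stack([ones] + own_lags + other_lags)    # unrestricted
    rss_r, rss_ur = rss(X_r, z), rss(X_ur, z)
    k = X_ur.shape[1]           # parameters in the unrestricted regression
    return ((rss_r - rss_ur) / lags) / (rss_ur / (n - k))

# Simulated data (made-up coefficients): lagged money drives GDP, not the reverse.
rng = np.random.default_rng(5)
T = 1000
money = np.zeros(T)
gdp = np.zeros(T)
for t in range(1, T):
    money[t] = 0.5 * money[t - 1] + rng.normal()
    gdp[t] = 0.5 * gdp[t - 1] + 0.8 * money[t - 1] + rng.normal()

f_m_to_gdp = granger_f(gdp, money)
f_gdp_to_m = granger_f(money, gdp)
print(f"F (M -> GDP): {f_m_to_gdp:.1f}")
print(f"F (GDP -> M): {f_gdp_to_m:.2f}")
```

With one lag and a large sample, the 5% critical value is roughly 3.84, so the first statistic should lie far above it, while the second will typically stay small because the simulated GDP series does not feed back into money.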
Criticisms of Causality Tests
Granger causality tests, much used in VAR modelling, do not explain some aspects of the VAR:
• It does not give the sign of the effect, we do not know if it is positive or negative
• It does not show how long the effect lasts for.
• It does not provide evidence of whether this effect is direct or indirect.
Prof. Dr. Bernhard Schölkopf
Kun Zhang
Max Planck at centre, 1931
Approach 2
“Distinguishing Causes from Effects using
Nonlinear Acyclic Causal Models”
Kun Zhang, Aapo Hyvärinen
Background
• Model-based causal discovery assumes a generative model to explain the data generating process.
• If the assumed model is close to the true one, such methods could not only detect the causal relations, but also discover the form in which each variable is influenced by others.
• For example,
• Granger causality assumes that effects must follow causes and that the causal effects are linear (Granger, 1980).
• If the data are generated by a linear acyclic causal model and at most one of the disturbances is Gaussian, independent component analysis (ICA) (Hyvärinen et al., 2001) can be exploited to discover the causal relations in a convenient way (Shimizu et al., 2006).
Shortcomings
• Previous models were too restrictive for real-life problems.
If the assumed model is wrong, model-based causal discovery may give misleading results.
Zhang Approach
In a large class of real-life problems, the following three effects usually exist.
1. The effect of the causes is usually nonlinear.
2. The final effect received by the target variable from all its causes contains some noise which is independent from the causes.
3. Sensors or measurements may introduce nonlinear distortions into the observed values of the variables.
Assumption: Involved nonlinearities are invertible.
Proposed Solution:
Each observed variable is a nonlinear function of its parents with additive noise, followed by a nonlinear distortion.
If all nonlinearities are invertible, conditions are given under which the causal relation can be identified.
Two-step method: constrained nonlinear ICA followed by statistical independence tests, to distinguish the cause from the effect in the two-variable case.
Proposed Causal Model:
$$x_i = f_{i,2}\left(f_{i,1}(pa_i) + e_i\right)$$
• $f_{i,2}$: nonlinear distortion (continuous and invertible)
• $f_{i,1}$: nonlinear transformation (not necessarily invertible)
• $e_i$: noise effect in transmission from $pa_i$ to $x_i$
First stage: a nonlinear transformation of its parents $pa_i$, denoted by $f_{i,1}(pa_i)$, plus some noise (or disturbance) $e_i$ (which is independent from $pa_i$).
Second stage: a nonlinear distortion $f_{i,2}$ is applied to the output of the first stage to produce $x_i$.
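The two-stage generation can be sketched for a single parent. The particular functions chosen below (a square for the first stage, a cube for the distortion) are arbitrary illustrative choices, not the ones used by the authors. The sketch also shows that, when the true functions are known, inverting the distortion and subtracting the first-stage output recovers the disturbance.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000

def f_inner(pa):
    """First-stage nonlinear transformation (illustrative; not invertible)."""
    return pa ** 2

def f_outer(z):
    """Second-stage distortion (illustrative; continuous and invertible)."""
    return z ** 3

def f_outer_inv(x):
    """Inverse of the second-stage distortion."""
    return np.cbrt(x)

pa = rng.uniform(-2.0, 2.0, size=n)     # parent variable
e = rng.uniform(-0.5, 0.5, size=n)      # disturbance, independent of the parent
x = f_outer(f_inner(pa) + e)            # observed variable after both stages

# With the true functions in hand, the disturbance can be recovered exactly.
e_recovered = f_outer_inv(x) - f_inner(pa)
print("max recovery error:", float(np.max(np.abs(e_recovered - e))))
print("correlation with parent:", float(np.corrcoef(e_recovered, pa)[0, 1]))
```

In practice the functions are unknown and have to be estimated, which is where the constrained nonlinear ICA step comes in.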
Zhang Approach
• Suppose the causal relation under examination is x1 → x2. If this causal relation holds, there exist nonlinear functions $f_{2,2}$ and $f_{2,1}$ such that $e_2 = f_{2,2}^{-1}(x_2) - f_{2,1}(x_1)$ is independent from x1.
• Accordingly, the constrained nonlinear ICA system is $y_1 = x_1$, $y_2 = g_2(x_2) - g_1(x_1)$.
• Use multilayer perceptrons (MLPs) to model the nonlinearities g1 and g2.
• Parameters in g1 and g2 are learned by making y1 and y2 as independent as possible.
Multilayer Perceptron (MLP)
• A multilayer perceptron (MLP) is a feedforward artificial neural network model that maps sets of input data onto a set of appropriate outputs.
Zhang Analysis
• y1 and y2 produced by the first step are the assumed cause and the estimated corresponding disturbance, respectively.
• In the second step, one needs to verify if they are independent.
• If y1 and y2 are independent, it implies x1 causes x2, and that g1 and g2 provide an estimate of $f_{2,1}$ and $f_{2,2}^{-1}$, respectively.
Success !!
• The Zhang approach solved the “CauseEffectPairs” problem in the Pot-luck challenge and successfully distinguished causes from effects
• Earned reward: $200
Approach 3
“Nonlinear causal discovery
with additive noise models”
Patrik O. Hoyer, Dominik Janzing, Joris Mooij, Jonas Peters, Bernhard Schölkopf
Claim:
“Non-linearities are a blessing rather than a curse” -- Hoyer
Idea: In reality, many causal relationships are non-linear.
How about generalizing the basic linear framework to non-linear models??
Hoyer Approach
When causal relationships are nonlinear it typically helps break the symmetry between the observed variables and allows the identification of causal directions.
As Friedman and Nachman have pointed out, non-invertible functional relationships between the observed variables can provide clues to the generating causal model.
We show that the phenomenon is much more general; for nonlinear models with additive noise almost any nonlinearities (invertible or not) will typically yield identifiable models.
Model:
$$x_i := f_i(x_{pa(i)}) + n_i$$
where
• $f_i$ is an arbitrary function (possibly different for each i),
• $x_{pa(i)}$ is a vector containing the elements $x_j$ such that there is an edge from j to i in the DAG G,
• the noise variables $n_i$ may have arbitrary probability densities $p_{n_i}(n_i)$.
Hoyer Model Estimation
1. Test whether x and y are statistically independent.
2. If not, test whether a model y := f(x) + n is consistent with the data: do a nonlinear regression of y on x (to get an estimate f’ of f), calculate the corresponding residuals n’ = y − f’(x), and test whether n’ is independent of x. If so, accept the model y := f(x) + n; if not, reject it.
3. Similarly, test whether the reverse model x := g(y) + n fits the data.
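The regress-then-test-independence procedure can be sketched as below. Several simplifications are assumed: polynomial least squares stands in for the nonlinear regression, a raw HSIC statistic with fixed-bandwidth Gaussian kernels stands in for a proper independence test with p-values, and the data are simulated with a deliberately non-invertible function to give a clear contrast. The names `hsic` and `anm_score` are hypothetical.

```python
import numpy as np

def hsic(a, b):
    """Biased HSIC estimate with unit-bandwidth Gaussian kernels on
    standardized inputs; larger values suggest stronger dependence."""
    n = len(a)

    def gram(v):
        v = (v - v.mean()) / v.std()
        d = v[:, None] - v[None, :]
        return np.exp(-0.5 * d ** 2)

    H = np.eye(n) - np.ones((n, n)) / n      # centring matrix
    return float(np.trace(gram(a) @ H @ gram(b) @ H)) / n ** 2

def anm_score(cause, effect, degree=3):
    """Fit effect = f(cause) + residual with a polynomial and return the
    dependence between the candidate cause and the residual."""
    coeffs = np.polyfit(cause, effect, degree)
    residual = effect - np.polyval(coeffs, cause)
    return hsic(cause, residual)

# Simulated additive-noise data where x is the true cause of y.
rng = np.random.default_rng(4)
n = 300
x = rng.uniform(-2.0, 2.0, size=n)
y = x ** 2 + rng.uniform(-0.5, 0.5, size=n)

forward = anm_score(x, y)     # model y := f(x) + n
backward = anm_score(y, x)    # model x := g(y) + n
print(f"dependence score, forward model:  {forward:.4f}")
print(f"dependence score, backward model: {backward:.4f}")
```

The direction with the lower score is the one whose residuals look independent of the input. A real analysis would replace the score comparison with a calibrated independence test and a more flexible regression method.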
Hoyer Test Results
the “Old Faithful” dataset
• Obtains a p-value of 0.5 for the (forward) model “current duration causes next interval length” and
• a p-value of $4.4 \times 10^{-9}$ for the (backward) model “next interval length causes current duration”
71
![Page 72: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/72.jpg)
the “Abalone” dataset from the UCI ML repository
• The correct model “age causes length” leads to a p-value of 0.19,
• The reverse model “length causes age” comes with $p < 10^{-15}$
72
Hoyer Test Results
![Page 73: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/73.jpg)
Temperature Altitude Statistics
• The correct model “altitude causes temperature” leads to $p = 0.017$,
• “Temperature causes altitude” can clearly be rejected ($p = 8 \times 10^{-15}$)
73
Hoyer Test Results
![Page 74: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/74.jpg)
Approach 4
“A Linear Non-Gaussian Acyclic Model for Causal Discovery (LINGAM)”
Shohei Shimizu, Patrik O. Hoyer, Aapo Hyvärinen, Antti Kerminen
![Page 75: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/75.jpg)
Assumptions
1. Data generating process is linear
2. No unobserved confounders
3. Disturbance variables have non-Gaussian distributions with non-zero variances
Approach:
Use of Independent Component Analysis (ICA); the method is called Linear Non-Gaussian Acyclic Model (LINGAM) analysis
“when working with continuous-valued data, a significant advantage can be achieved by departing from the Gaussianity assumption”
75
![Page 76: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/76.jpg)
LINGAM Model
• Linear Non-Gaussian Acyclic Model
• Data Generating process:
76
![Page 77: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/77.jpg)
LINGAM Idea
• Key to Solution :
Observed variables are linear functions of the disturbance variables, and the disturbance variables are mutually independent and non-Gaussian.
$x = Bx + e$,
$x = Ae$,
where $A = (I - B)^{-1}$.
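The equivalence of the two forms can be checked numerically. The sketch below is illustrative only: the connection matrix `B`, the uniform (non-Gaussian) disturbances, and the sample size are made-up values, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical connection strengths for the causal chain x0 -> x1 -> x2.
# B is strictly lower triangular because the variables are listed in causal order.
B = np.array([[0.0,  0.0, 0.0],
              [1.5,  0.0, 0.0],
              [0.0, -0.7, 0.0]])

# Mutually independent, non-Gaussian (uniform) disturbances, one column per sample.
e = rng.uniform(-1.0, 1.0, size=(3, 1000))

# Reduced form: x = A e with A = (I - B)^{-1}.
A = np.linalg.inv(np.eye(3) - B)
x = A @ e

# The generated data satisfy the structural form x = B x + e.
print(np.allclose(x, B @ x + e))
```

Because `B` is strictly lower triangular here, `A` is lower triangular with ones on its diagonal. This is the structure that the permutation and normalization steps of the algorithm try to recover from the ICA estimate.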
77
![Page 78: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/78.jpg)
LINGAM Algorithm
LINGAM can be briefly summarized as follows:
• First, use a standard ICA algorithm (e.g., FastICA algorithm) to obtain an estimate of the mixing matrix A (or equivalently of W),
• subsequently permute it and normalize it appropriately before using it to compute B containing the sought connection strengths $b_{ij}$.
78
![Page 79: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/79.jpg)
LINGAM Algorithm
(1) Given: an $m \times n$ data matrix X ($m \ll n$), where each column contains one sample vector x.
(a) Subtract the mean from each row of X.
(b) Apply ICA to get $X = AS$, where S contains the independent components in its rows.
(c) Note: $W = A^{-1}$.
(2) Find $\tilde{W}$, obtained by permuting the rows of W, such that $\tilde{W}$ contains no zeros on the main diagonal.
(3) Divide each row of $\tilde{W}$ by its corresponding diagonal element to get $\tilde{W}'$, with all 1's on the main diagonal.
79
![Page 80: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/80.jpg)
(4) Compute $\hat{B} = I - \tilde{W}'$.
(5) To find the causal order, find the permutation matrix P (applied to $\hat{B}$) which yields
$\tilde{B} = P \hat{B} P^{T}$
as close as possible to strictly lower triangular. Closeness can be measured using
$\sum_{i \le j} \tilde{B}_{ij}^{2}$
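Steps (2) to (5) can be sketched as follows. This is a toy illustration, not the authors' implementation. The ICA step is skipped: ICA recovers W only up to row order and row scaling, so the sketch starts from a hand-made scrambled copy of the true $W = I - B$. The brute-force search over permutations is an assumption that is only practical for a handful of variables, and the names `recover_b` and `causal_order` are hypothetical.

```python
import itertools
import numpy as np

def recover_b(w_ica):
    """Steps 2-4: permute rows, normalize the diagonal to 1, return B_hat = I - W'."""
    m = w_ica.shape[0]
    # Step 2: choose the row order that keeps the diagonal away from zero
    # by minimizing sum_i 1 / |W[i, i]| (brute force over all row orders).
    def cost(rows):
        return sum(1.0 / max(abs(w_ica[r, i]), 1e-12) for i, r in enumerate(rows))
    best_rows = min(itertools.permutations(range(m)), key=cost)
    w_perm = w_ica[list(best_rows), :]
    # Step 3: divide each row by its diagonal element.
    w_norm = w_perm / np.diag(w_perm)[:, None]
    # Step 4: B_hat = I - W'.
    return np.eye(m) - w_norm

def causal_order(b_hat):
    """Step 5: variable order making P B_hat P^T closest to strictly lower triangular."""
    m = b_hat.shape[0]
    def upper_mass(order):
        b_perm = b_hat[np.ix_(order, order)]
        return np.sum(np.triu(b_perm) ** 2)   # sum over i <= j of squared entries
    return min(itertools.permutations(range(m)), key=upper_mass)

# Made-up ground truth: x2 -> x0 -> x1.
B_true = np.array([[ 0.0, 0.0, 0.8],
                   [-0.5, 0.0, 0.0],
                   [ 0.0, 0.0, 0.0]])
W_true = np.eye(3) - B_true

# Mimic the ICA output: rows of W in arbitrary order with arbitrary scale.
scales = np.array([2.0, -3.0, 0.5])
W_ica = scales[:, None] * W_true[[2, 0, 1], :]

B_hat = recover_b(W_ica)
order = causal_order(B_hat)
print(B_hat)
print(order)
```

With real data, `W_ica` would come from an ICA routine applied to the centered data matrix, and the entries of `B_hat` would only be approximately zero where no edge exists.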
80
LINGAM Algorithm
![Page 81: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/81.jpg)
Practical Experiments
Project
Detecting Covert Links in Instant Messaging (IM) Networks Using Flow Level Log Data
81
![Page 82: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/82.jpg)
• Users sending Instant Messages (IM) to relay server
• Relay server forwards messages to corresponding users
• All packets carry only the user's and the server's IP addresses as source and destination addresses
82
Introduction
Scenario # 1
![Page 83: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/83.jpg)
83
Scenario # 2
Introduction
• Users may be communicating behind a proxy server
• Users behind proxy servers are visible in scenario#2.
![Page 84: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/84.jpg)
Data Set
• Yahoo! Messenger IM network.
• Data Set Details:
• Area: New York City area.
• Time: 12am to 12am
• Data Set Files:
• Input Data File:
• User-to-server traffic traces.
• Ground Truth Data File:
• Record of the actual user-to-user connections.
84
![Page 85: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/85.jpg)
Data Set Statistics

| Time | Duration | Users | Messages | Sessions |
|---|---|---|---|---|
| 8-8:10a | 10 mins | 3,420 | 15,370 | 1,968 |
| 8-8:20a | 20 mins | 5,405 | 33,192 | 3,265 |
| 8-8:30a | 30 mins | 7,438 | 53,649 | 4,661 |
| 8-8:40a | 40 mins | 9,513 | 75,810 | 6,179 |
| 8-8:50a | 50 mins | 11,684 | 99,721 | 7,669 |
| 8-9a | 60 mins | 13,953 | 126,694 | 9,264 |
85
![Page 86: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/86.jpg)
Granger Causality
86
F-test statistics for the Granger Causality test
![Page 87: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/87.jpg)
Zhang Approach Results
87
Zhang results for talking and non-talking pairs for IM networks in Yahoo!
![Page 88: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/88.jpg)
Just for Knowledge
• Classifier Tool
WEKA (Waikato Environment for Knowledge Analysis): a popular suite of machine learning software written in Java, developed at the University of Waikato, New Zealand
88
Weka bird: found in New Zealand; a vulnerable species.
![Page 89: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/89.jpg)
WEKA
89
![Page 90: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/90.jpg)
Conclusion
90
![Page 91: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/91.jpg)
Given: A causes B.
To prove: Must A and B be correlated?
Result: YES or NO?
Why? Can you show it?
91
![Page 92: Correlation implies Causation ? Saad Saleh Team Lead, Wisnet Lab, SEECS saad.saleh@seecs.edu.pk.](https://reader035.fdocuments.us/reader035/viewer/2022062308/56649d6f5503460f94a516c8/html5/thumbnails/92.jpg)
92