Quantitative data analysis Lecture 6
www.handels.gu.se
E 45
Johan Brink, IIE
24 November
Quantitative data analysis
Lecture 6
23-04-19
Agenda
Chapter 14
• Univariate analysis
• Bivariate analysis
• Multivariate analysis
• Contingency tables
• Pearson's correlation
• t-test
• Chi square
• Factor & cluster analysis
Univariate analysis
One variable at a time
• Frequency tables – Bar charts
• Grouping of ratio & interval variables: 20-29, 30-39… – Histograms
Arithmetic mean = Sum of all values / number of values = 33.6
Median = Midpoint of the distribution of values
Mode = The most frequent value in the distribution
[Figure: Ages of gym visitors, by age group (-20, 21-30, 31-40, 41-50, 51-); vertical axis from 0 to 40]
          Interval/Ratio (scale & distance)   Ordinal (ranked)    Nominal (can't be ranked)
Mean      Yes                                 No (used anyhow)    No
Median    Yes                                 Yes                 No
Mode      Yes                                 Yes                 Yes
Dispersion
Range: Min to Max (3 to 7)
Variance = standard deviation²
s² = Σ (x-M)² / (n-1) = 1.5 -> Standard deviation = √1.5 = 1.22
Point   x    (x-M)²
1       5    0
2       5    0
3       6    1
4       5    0
5       4    1
6       4    1
7       7    4
8       3    4
9       6    1
Sum     45   12
Mean M = 45/9 = 5
Variance = 12/(9-1) = 1.5
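The measures of central tendency and dispersion can be checked with a short Python sketch. It uses the nine observations from the dispersion table; the variable names are illustrative only, and the sample variance divides by n-1 as in the formula above.

```python
import math
from collections import Counter

# The nine observations from the dispersion table
values = [5, 5, 6, 5, 4, 4, 7, 3, 6]

n = len(values)
mean = sum(values) / n

# Median: middle value of the sorted observations
ordered = sorted(values)
mid = n // 2
median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

# Mode: the most frequent value
mode = Counter(values).most_common(1)[0][0]

# Range: smallest and largest observation
value_range = (min(values), max(values))

# Sample variance: sum of squared deviations divided by n - 1
squared_deviations = [(x - mean) ** 2 for x in values]
variance = sum(squared_deviations) / (n - 1)
std_dev = math.sqrt(variance)

print("mean:", mean, "median:", median, "mode:", mode)
print("range:", value_range)
print("variance:", variance, "standard deviation:", round(std_dev, 2))
```

Python's built-in statistics module offers the same calculations (mean, median, mode, variance, stdev) if the explicit steps are not needed.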
Multivariate analysis
Spurious correlation: The relationship between two variables is caused by a third, underlying factor
[Diagram: A and B, with C underlying both]
Intervening variable: Chain of relationships
[Diagram: A, B and C in a chain]
Moderating variable: The relationship between A & B only exists if C is present
[Diagram: A and B, with C acting on their relationship]
Analyzing data
Correlation & relationship
Between variables (questions, groups, items)
Do the answers to question 1 correlate with the answers to question 2?
- Do different questions/items measure the same construct, or do they capture a relationship?
Test – significant differences
Between variables
- Questions, groups, items
- Across time/treatments
Is the mean different enough given the standard deviations?
t-test, Chi-squared (nominal scales)
Differences from the expected value?
Constructs & Items
• Cronbach's α is a measure of how well variables measure the same underlying phenomenon
[Diagram: Variables 1-6 correspond to Items (Questions) 1-6, which are grouped under Construct 1 (Org. Culture) and Construct 2 (Performance)]
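To make Cronbach's α concrete, here is a minimal Python sketch using the standard formula α = k/(k-1) × (1 - Σ item variances / variance of the total score), where k is the number of items. The slides give no item data, so the answers below are hypothetical and the function names are illustrative only.

```python
def sample_variance(xs):
    """Sample variance with n - 1 in the denominator."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def cronbach_alpha(items):
    """items: one list of scores per item, same respondents in the same order."""
    k = len(items)
    # Total score per respondent across all items
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(sample_variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / sample_variance(totals))


# Hypothetical answers (1-5 scale) from five respondents to three items
# that are meant to measure the same construct
hypothetical_items = [
    [4, 3, 5, 2, 4],
    [4, 2, 5, 3, 4],
    [5, 3, 4, 2, 3],
]

alpha = cronbach_alpha(hypothetical_items)
print("Cronbach's alpha:", round(alpha, 2))
```

The closer the items move together across respondents, the closer α gets to 1. Items with identical answers give exactly 1.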
Hypothesis testing
H0: There is no difference between group A and group B
H1: There is a difference between group A and group B
H0: There is no connection between variable X and Y
H1: There is a connection between variable X and Y
                              Real relationship (unknown)
Result of statistical test    H0 true          H0 false
H0 rejected                   Type 1 error     Correct
H0 not rejected               Correct          Type 2 error
Reducing the risk of a type 1 error by tightening the level of significance from 5% to 1% increases the risk of committing a type 2 error!
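Bivariate analysis
Method of analysis for each pair of variable types:
Nominal (can't be ranked)
• with Nominal: Contingency table, Chi square, Cramér's V
• with Ordinal: Contingency table, Chi square, Cramér's V
• with Interval/ratio: Contingency table, Chi square, Cramér's V
• with Dichotomous: Contingency table, Chi square, Cramér's V
Ordinal (ranked)
• with Nominal: Contingency table, Chi square, Cramér's V
• with Ordinal: Spearman's rho
• with Interval/ratio: Spearman's rho
• with Dichotomous: Spearman's rho
Interval/ratio (scale & distance)
• with Nominal: Contingency table, Chi square, Cramér's V
• with Ordinal: Spearman's rho
• with Interval/ratio: Pearson's r
• with Dichotomous: Spearman's rho
Dichotomous (Yes/No)
• with Nominal: Contingency table, Chi square, Cramér's V
• with Ordinal: Spearman's rho
• with Interval/ratio: Spearman's rho
• with Dichotomous: Phi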
Contingency tables
Reasons          Male         Female
                 #     %      #     %
Relaxation       3     7      6     13
Fitness          15    36     16    33
Lose weight      8     19     25    52
Build strength   16    38     1     2
Total            42    100    48    100
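The percentages in the contingency table are column percentages: each count divided by the total for that gender. A minimal Python sketch, using the counts from the table, reproduces them. The names are illustrative, and rounding half up is an assumption that matches the 13% shown for 6 out of 48.

```python
# Counts from the contingency table: reason -> (male, female)
counts = {
    "Relaxation": (3, 6),
    "Fitness": (15, 16),
    "Lose weight": (8, 25),
    "Build strength": (16, 1),
}

# Column totals per gender
male_total = sum(male for male, _ in counts.values())
female_total = sum(female for _, female in counts.values())


def percent(count, total):
    """Column percentage, rounded half up (assumed rounding rule)."""
    return int(100 * count / total + 0.5)


for reason, (male, female) in counts.items():
    print(reason, male, percent(male, male_total),
          female, percent(female, female_total))
print("Total", male_total, female_total)
```

Comparing percentages rather than raw counts makes the two columns comparable even though the groups differ in size (42 men, 48 women).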
Pearson's correlation
For interval/ratio variables
Measure of the strength of association between two variables
r = Between -1 and +1; 0 = no correlation, -1 or +1 = perfect correlation
r² × 100% = Percentage of the variation in one variable explained by the other
X   Y
1   2
2   3
3   4
4   5
5   5
5   6
7   7
7   8
8   9
9   8
r = 0.969
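The correlation for the ten X/Y pairs can be recomputed with a few lines of Python. This is a sketch of the textbook formula (sum of cross-products of deviations divided by the square root of the product of the two sums of squares); the function name is illustrative only.

```python
import math

# The ten X/Y pairs from the slide
x = [1, 2, 3, 4, 5, 5, 7, 7, 8, 9]
y = [2, 3, 4, 5, 5, 6, 7, 8, 9, 8]


def pearson_r(xs, ys):
    """Pearson's r from deviations around the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    ss_x = sum((a - mean_x) ** 2 for a in xs)
    ss_y = sum((b - mean_y) ** 2 for b in ys)
    return cross_products / math.sqrt(ss_x * ss_y)


r = pearson_r(x, y)
print("r =", round(r, 3))
print("explained variation (%) =", round(r ** 2 * 100, 1))
```

With r = 0.969, r² × 100% is roughly 94%, so most of the variation in Y is shared with X in this example.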
t-test: A statistical test to see if there is a difference between two samples
Hypothesis: H0, the true means are equal
Alternative: H1, there is a difference
n1=25, n2=24, Mean1=64, Mean2=56, s1=10, s2=8
Df = n1+n2-2 = 47
t = (Mean1-Mean2) / √[ (((n1-1)s1² + (n2-1)s2²) / (n1+n2-2)) * (1/n1 + 1/n2) ]
t = (64-56) / √[ (((25-1)10² + (24-1)8²) / (25+24-2)) * (1/25 + 1/24) ] = 3.08
Statistic table: t(df=47, 0.05) = 2.012
3.08 > 2.012, thus reject H0: there is a difference!
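The t-test arithmetic can be traced step by step in Python. This sketch follows the pooled-variance formula for two independent samples and uses the summary statistics from the slide. The function name is illustrative, and the critical value 2.012 is taken from the slide rather than computed.

```python
import math


def pooled_t(mean1, mean2, s1, s2, n1, n2):
    """t statistic for two independent samples with a pooled variance."""
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


t, df = pooled_t(mean1=64, mean2=56, s1=10, s2=8, n1=25, n2=24)

critical_t = 2.012  # table value for df=47 at the 5% level, from the slide
reject_h0 = abs(t) > critical_t

print("t =", round(t, 2), "df =", df, "reject H0:", reject_h0)
```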
Chi square
Is there a difference between age groups (young, middle-aged and old) in preference for A or B?
• Nominal scales!
• 40 respondents
Chi² = Σ (observed-expected)² / expected
Chi square
Observed frequencies
      Young   Middle age   Old   Σ
A     20      5            0     25
B     5       5            5     15
Σ     25      10           5     40
Chi square
Observed frequencies, with expected frequencies (row total × column total / 40) in parentheses
      Young         Middle age   Old         Σ
A     20 (15.625)   5 (6.25)     0 (3.125)   25
B     5 (9.375)     5 (3.75)     5 (1.875)   15
Σ     25            10           5           40
Chi square
Chi² = (20-15.625)²/15.625 + (5-6.25)²/6.25 + (0-3.125)²/3.125 + (5-9.375)²/9.375 + (5-3.75)²/3.75 + (5-1.875)²/1.875 = 12.27
Df = (r-1)*(c-1) = (2-1)*(3-1) = 2
Critical value: Chi²(0.05, df=2) = 5.99
12.27 > 5.99
Thus reject H0: the preference for A or B differs between age groups.
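The whole chi-square calculation, from observed counts to the test decision, can be sketched in Python. The counts are those from the age-group table, the critical value 5.99 is taken from the slide, and the variable names are illustrative. Cramér's V is added as an assumption about what one might report next, using the standard formula √(Chi² / (n × min(r-1, c-1))).

```python
import math

# Observed counts: rows = preference (A, B), columns = young, middle age, old
observed = [
    [20, 5, 0],  # A
    [5, 5, 5],   # B
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count per cell: row total * column total / grand total
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

rows, cols = len(observed), len(observed[0])
df = (rows - 1) * (cols - 1)

critical_chi2 = 5.99  # table value for df=2 at the 5% level, from the slide
reject_h0 = chi2 > critical_chi2

# Cramér's V: strength of association on a 0-1 scale
cramers_v = math.sqrt(chi2 / (n * min(rows - 1, cols - 1)))

print("expected:", expected)
print("chi2 =", round(chi2, 2), "df =", df, "reject H0:", reject_h0)
print("Cramér's V =", round(cramers_v, 2))
```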
Factor analysis and Cluster analysis
• Reduce the data
• Variables which measure the same thing
• Underlying factors