Sta2 Report
Uploaded by trang-nguyen
8/7/2019 Sta2 Report
1/23
HANOI UNIVERSITY
FACULTY OF MANAGEMENT AND TOURISM
-o0o-
STATISTICS FOR ECONOMICS
Is there any difference
in the number of
students per teacher?
Case study - ANOVA
TABLE OF CONTENTS
Scenario
Methodology
  Data collection
  Approach
Analysis and discussion
  Check the required condition
    Normality
    Variances equality
  Hypothesis testing
    2.1. Testing block means
    2.2. Testing treatment means
  Discussion of finding
Limitation
Scenario
In recent years, along with the increasing demand for human resources, a growing number of universities have planned to open new faculties and to increase student admissions in these high-demand sectors. However, it is undeniable that a mismatch between student enrollment and the number of teachers/lecturers has a large effect on the quality of education and training. Aware of this important issue, our group decided to find out whether there are any differences in the number of students per teacher from 2005 to 2009 (specifically 2005, 2007 and 2009) by using a statistical technique (two-way ANOVA). The available data are blocked into six main regions of Vietnam. The test results show that during this five-year period, despite changes in the numbers of both students and teachers, the number of students per teacher stayed nearly the same, which leads to our conclusion that there is no difference among the three years.
Methodology
Data collection
As the problem objective is to test whether the number of students per teacher in Viet Nam has changed in recent years, we conducted the test over three years: 2005, 2007, and 2009. Because the data type is quantitative, we decided to use the analysis of variance. The data were collected from the Vietnam General Statistics Office website (shown in Appendix E).
However, many other factors may affect the result of our test, so the variability within the samples might be large. To reduce the variation within each year, we organized the data into blocks before testing. We therefore took a sample of six regions (Red River delta, Northern midlands and mountain areas, Northern Central and Central Coastal area, Central Highlands, South East, and Mekong River delta) to test the changes in the ratio of students to teachers in those areas over the three years. Nevertheless, because it was difficult to conduct the experiment on whole regions, we used Excel to randomly select one province in each region to represent that region. This gave us six provinces, one for each region.
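The selection step described above can be sketched in a few lines. This is only an illustration: the report used Excel, Python's `random` module stands in for it here, the province lists are shortened samples of the names in Appendix E, and the function name and `seed` parameter are hypothetical.

```python
import random

# Shortened, illustrative province lists (names taken from Appendix E);
# a real selection would draw from every province in each region.
REGIONS = {
    "Red river delta": ["Hà Nội", "Vĩnh Phúc", "Bắc Ninh"],
    "Northern midlands and mountain areas": ["Hà Giang", "Cao Bằng", "Thái Nguyên"],
    "Northern Central area and Central coastal area": ["Thanh Hóa", "Nghệ An", "Đà Nẵng"],
    "Central highlands": ["Kon Tum", "Gia Lai", "Lâm Đồng"],
    "South East": ["Bình Dương", "Đồng Nai", "TP. Hồ Chí Minh"],
    "Mekong river delta": ["Long An", "An Giang", "Cần Thơ"],
}


def pick_representatives(regions, seed=None):
    """Draw one province uniformly at random from each region (block)."""
    rng = random.Random(seed)  # seed is a hypothetical knob for repeatability
    return {region: rng.choice(provinces) for region, provinces in regions.items()}


chosen = pick_representatives(REGIONS, seed=1)
for region, province in chosen.items():
    print(f"{region}: {province}")
```

Each region contributes exactly one province, so the result always has six entries, one per block.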
Approach
To determine whether the number of students per teacher differs across the three years, we must first check the required conditions for the F-test of two-way ANOVA: the random variable is normally distributed and the population variances are equal. We check each condition in turn.
Analysis and discussion
Check the required condition
Normality
As the histograms in Appendix D show, the three populations are non-normal. In order to use two-way ANOVA, we assume that all of them are normally distributed.
Variances equality
Since the best estimator of population variance is the sample variance, we applied
After calculating the variances (shown in Appendix A), we found that the largest variance is that of 2009 and the smallest is that of 2007, so we use the F-test to make an inference about those two population variances.
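1. Testing hypothesis:
$H_0: \sigma_1^2 / \sigma_2^2 = 1$
$H_A: \sigma_1^2 / \sigma_2^2 \neq 1$
2. Test statistic:
$F = s_1^2 / s_2^2$ is F-distributed with $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$
3. Significance level: $\alpha = 0.05$
4. Decision rule:
Reject $H_0$ if $F > F_{\alpha/2, \nu_1, \nu_2} = F_{.025, \nu_1, \nu_2}$ or $F < F_{1-\alpha/2, \nu_1, \nu_2} = F_{.975, \nu_1, \nu_2}$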
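The two-tailed variance test above can be sketched as follows, using the sample variances from the Excel output in Appendix B. The critical value 7.15 for $F_{.025,5,5}$ is an assumption taken from a standard F table, not from the report, and the function name is hypothetical.

```python
# Sample variances reported in the Excel output (Appendix B).
s1_sq = 107.9649341  # Variable 1: 2007
s2_sq = 156.6043958  # Variable 2: 2009
n1 = n2 = 6          # six observations (blocks) per year

# Assumption: upper 2.5% point of F(5, 5) from a standard F table,
# not a value quoted in the report.
F_UPPER = 7.15
F_LOWER = 1 / F_UPPER  # lower critical value; this shortcut works because nu1 == nu2


def variance_ratio_test(var1, var2, f_lower, f_upper):
    """Return the F statistic and whether H0 (equal variances) is rejected."""
    f_stat = var1 / var2
    reject = f_stat > f_upper or f_stat < f_lower
    return f_stat, reject


f_stat, reject_h0 = variance_ratio_test(s1_sq, s2_sq, F_LOWER, F_UPPER)
print(f"df = ({n1 - 1}, {n2 - 1}), F = {f_stat:.7f}, reject H0: {reject_h0}")
```

Under these assumptions the statistic falls between the two critical values, so the equal-variance condition is not rejected.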
Hypothesis testing
2.1. Testing block means
1. Testing hypothesis:
$H_0$: Block means are all equal
$H_A$: At least two block means differ
2. Test statistic:
$F = MSB / MSE$ is F-distributed with $\nu_1 = b - 1$ and $\nu_2 = n - k - b + 1$
3. Significance level: $\alpha = 0.05$
4. Decision rule:
Reject $H_0$ if $F > F_{\alpha, b-1, n-k-b+1} = F_{.05, 5, 10} = 3.33$
5. Value of test statistic:
As shown in the ANOVA table (Appendix C), $F = 1.98995$
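As a worked check of the steps above, the degrees of freedom and the decision can be written out with the values the report gives: $b = 6$ regions, $k = 3$ years, $n = 18$ observations, and MSB and MSE from Appendix C. This is only a sketch of the arithmetic, rounded as in the Excel output.

```latex
\begin{aligned}
\nu_1 &= b - 1 = 6 - 1 = 5, \\
\nu_2 &= n - k - b + 1 = 18 - 3 - 6 + 1 = 10, \\
F &= \frac{MSB}{MSE} = \frac{186.572}{93.7572} \approx 1.99 < F_{.05,5,10} = 3.33 .
\end{aligned}
```

Since $F$ falls below the critical value, $H_0$ is not rejected, which agrees with the finding that the block means do not differ.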
2.2. Testing treatment means
1. Testing hypothesis:
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_A$: At least two treatment means differ
2. Test statistic:
$F = MST / MSE$ is F-distributed with $\nu_1 = k - 1$ and $\nu_2 = n - k - b + 1$
3. Significance level: $\alpha = 0.05$
4. Decision rule:
Reject $H_0$ if $F > F_{\alpha, k-1, n-k-b+1} = F_{.05, 2, 10} = 4.10$
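The same arithmetic for the treatment test can be sketched with the Columns row of the ANOVA table in Appendix C ($k = 3$ years, $b = 6$ regions, $n = 18$). The values are rounded as in the Excel output.

```latex
\begin{aligned}
\nu_1 &= k - 1 = 3 - 1 = 2, \\
\nu_2 &= n - k - b + 1 = 18 - 3 - 6 + 1 = 10, \\
F &= \frac{MST}{MSE} = \frac{10.5395}{93.7572} \approx 0.112 < F_{.05,2,10} = 4.10 .
\end{aligned}
```

The statistic is far below the critical value, so there is not enough evidence to reject $H_0$ for the three years.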
Discussion of finding
It is clear from the hypothesis tests that there is not enough evidence to reject the null hypothesis, which states that there is no difference in the student/teacher ratio in Vietnam over the five-year period. The results of the data analysis section also show no difference among the block means representing the six main regions of Vietnam. Therefore, the ratio appears to be balanced across these six areas.
If the test had not been conducted, people might think that the ratio of students per teacher increases over the years because of student growth in Vietnam. In fact, due to the high demand for high-quality human resources to meet the challenges of economic growth, many universities/colleges have increased their admissions year by year. Aware of this, educational institutions have planned to recruit more teachers to keep up with the increase in the number of students and to maintain or improve teaching quality. This helps explain why the number of students per teacher has stayed unchanged over the years. However, compared with the world's standard (15-20 students/teacher) and the goal of the Ministry of Education and Training (20
Limitation
Although we did our best in conducting the test, some limitations remain. The following limitations can reduce the accuracy of our tests:
+ Lack of information: it is difficult to find information over longer periods (5-year periods instead of the 1-year periods we showed previously). One-year periods can be too short, so this limitation can make the measured change in the number of professors inaccurate. As a result, our conclusions may not be very exact.
+ Rejection regions: we chose $\alpha = 0.05$, which might lead to a Type II error. However, we believe that it does not affect our result very much.
+ Time consuming: because checking the assumptions is necessary for testing, we spent a lot of time checking the normality of the populations and the equality of their variances. Fortunately, the histograms drawn resulted in normally distributed populations, as we expected. Moreover, we also checked SSB to ensure that there is no difference between blocks.
+ Normality: in order to use the two-way ANOVA test above, we have assumed that the three populations are normally distributed.
to next year, which means that it does not have a strong influence on the quality of teaching and studying. However, we still have some recommendations to improve those qualities.
+ Reinforcing highly qualified professors: as the number of students in universities increases, it creates a lot of pressure on education. A lack of highly qualified teachers is inevitable. Therefore, reinforcing the pool of highly qualified professors is the first priority.
+ Motivating teachers: teachers should be supported in further study with suitable compensation. Besides, good relationships between teachers and their students should also be fostered. This would reduce the number of teachers who quit their jobs.
+ Changing from traditional classes to new-model ones: taking Hanoi University as an example, students and teachers attend five lectures and five tutorials each week. Consequently, the professors and their students have extra time for self-study.
+ Flexible time: both teachers and students can take part in social activities, voluntary events, and part-time jobs to gain practical experience and soft skills such as communication skills. In addition, universities can provide adequate facilities and equipment for teaching.
year in the regions indicated above. It also means that Vietnamese university education can provide enough teachers to meet the needs of society in general and the increase in enrolment targets over the years. However, we still need some recommendations to improve the education system, as shown in our report.
During the time we were conducting the research, some limitations occurred which led to inaccurate results. In addition, because of the characteristics of the ANOVA test and time constraints, we cannot show the whole picture of the issue, for example the trend in enrolment targets, changes in teaching methods and class models, etc. If by any chance our report has aroused the interest of other researchers in the same topic, we hope that future studies will be conducted on a larger time scale, with more detailed data, and with further knowledge of statistics.
Reference
General Statistics Office, Number of teachers, students in universities and colleges by province, http://www.gso.gov.vn/default_en.aspx?tabid=474&idmid=3&ItemID=10207
http://vietbao.vn/Tuyen-sinh/Chi-tieu-tuyen-sinh-vao-cac-truong-DH-CD-nam-2005/30050060/290/
http://vietbao.vn/Giao-duc/Chi-tieu-tuyen-sinh-cac-truong-DH-nam-2007/70079320/202/
Appendixes
A. Calculating the sample variance
SUMMARY                                         Count      Sum   Average  Variance
Red river delta                                     3  79.5843   26.5281   9.12889
Northern midlands and mountains areas               3  63.4919    21.164   110.341
Northern Central area and Central coastal area      3  105.621   35.2069   83.1804
Central highlands                                   3  75.6042   25.2014   179.928
South East                                          3  83.4234   27.8078   85.7486
Mekong river delta                                  3  34.4233   11.4744   10.9987

2005                                                6  149.567   24.9278   109.518
2007                                                6  138.567   23.0944   107.965
2009                                                6  154.015   25.6691   156.604
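The averages in this summary are enough to reproduce the between-block and between-treatment sums of squares that the ANOVA table reports. A minimal sketch follows, assuming the balanced design shown here (3 observations per region, 6 per year); the function name is hypothetical, and small differences from the Excel figures come from the rounded averages.

```python
# Averages copied from the summary table (Appendix A).
region_means = [26.5281, 21.164, 35.2069, 25.2014, 27.8078, 11.4744]
year_means = [24.9278, 23.0944, 25.6691]


def between_group_ss(means, obs_per_group):
    """Sum of squares between groups for a balanced design."""
    # With equal group sizes, the grand mean is the mean of the group means.
    grand_mean = sum(means) / len(means)
    return obs_per_group * sum((m - grand_mean) ** 2 for m in means)


ssb = between_group_ss(region_means, obs_per_group=3)  # blocks (Rows)
sst = between_group_ss(year_means, obs_per_group=6)    # treatments (Columns)
print(f"SSB = {ssb:.3f}, SST = {sst:.3f}")
```

Both values should land close to the Rows and Columns sums of squares in the Excel ANOVA output (932.8621 and 21.07898).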
B. Check the variances equality
F-Test Two-Sample for Variances
                Variable 1    Variable 2
Mean           23.09443724   25.66908638
Variance       107.9649341   156.6043958
Observations             6             6
df                       5             5
F                0.6894119
P(F
C. ANOVA (using Excel)
Source of Variation        SS   df        MS        F   P-value    F crit
Rows                 932.8621    5   186.572  1.98995   0.16583   3.32583
Columns              21.07898    2   10.5395  0.11241   0.89479   4.10282
Error                937.5724   10   93.7572
Total                1891.514   17
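The table is internally consistent, which can be checked by rebuilding the mean squares and F statistics from the SS and df columns alone. The sketch below takes the critical values straight from the F crit column instead of computing them; the variable names are illustrative.

```python
# SS, df and critical values copied from the Excel ANOVA table above.
ss = {"Rows": 932.8621, "Columns": 21.07898, "Error": 937.5724}
df = {"Rows": 5, "Columns": 2, "Error": 10}
f_crit = {"Rows": 3.32583, "Columns": 4.10282}

# Mean square = sum of squares / degrees of freedom.
ms = {source: ss[source] / df[source] for source in ss}

# Each F statistic is the source's mean square over the error mean square.
f_stat = {source: ms[source] / ms["Error"] for source in ("Rows", "Columns")}

# Reject H0 only when F exceeds the tabulated critical value.
reject = {source: f_stat[source] > f_crit[source] for source in f_stat}

for source in f_stat:
    print(f"{source}: MS = {ms[source]:.4f}, F = {f_stat[source]:.5f}, "
          f"reject H0: {reject[source]}")
print(f"Total SS = {sum(ss.values()):.3f}, total df = {sum(df.values())}")
```

Neither F statistic exceeds its critical value, so neither null hypothesis is rejected.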
D. Histograms
Population 1: 2005
[Histogram "Population 1 - 2005": frequency by bin]
Bin    Frequency
10     0
20     2
30     3
40     0
50     1
More   0

Population 2: 2007
[Histogram "Population 2 - 2007": frequency by bin]
Bin    Frequency
10     1
20     1
30     2
40     2
50     0
More   0
Population 3: 2009
[Histogram "Population 3 - 2009": frequency by bin]
Bin    Frequency
10     0
20     2
30     2
40     2
50     0
More   0
E. Data from GSO
                           2007                        2008                        2009
Num Province            Teacher  Student        S/T Teacher  Student        S/T Teacher  Student        S/T
Whole country             61321  1928436              60651  1675700              65115  1796174
Red river delta           25384   791671              25310   695089              26409   725976
1   Hà Nội                16476   606207  36.793336   17065   529211  31.011485   18083   541671  29.954709
2   Hà Tây                 1404    29435    20.9651       -        -          -       -        -          -
3   Vĩnh Phúc               536    17704  33.029851     568    18384  32.366197     646    19576  30.303406
4   Bắc Ninh                522     7624  14.605364     632    11676  18.474684     543    14530  26.758748
5   Quảng Ninh              896     8100  9.0401786     811     9272  11.432799     870    10277  11.812644
6   Hải Dương               761     9677  12.716163     848    13437  15.845519     876    13312  15.196347
7   Hải Phòng              1776    49913  28.104167    1862    51070  27.427497    1894    53857  28.435586
8   Hưng Yên                624    22875  36.658654     907    22195  24.470783     963    24067  24.991693
9   Thái Bình               621     8409  13.541063     612     7222  11.800654     613     8450  13.784666
10  Hà Nam                  118     3922  33.237288     268     3668  13.686567     315     4070  12.920635
11  Nam Định               1517    27081  17.851681    1504    27590  18.344415    1372    34802  25.365889
12  Ninh Bình               133      724   5.443609     233     1364  5.8540773     234     1364  5.8290598
Northern midlands and
mountain areas             4863   112385               5702   105105               5978   120033
1   Hà Giang                 71     2134  30.056338      65     1001       15.4      71     1441  20.295775
2   Cao Bằng                107     1410   13.17757     110     1734  15.763636      97     1571  16.195876
3   Bắc Kạn                 212     2080  9.8113208      45      967  21.488889      45      688  15.288889
4   Tuyên Quang              80      530      6.625      73      925  12.671233      73      905   12.39726
5   Lào Cai                  97     1917  19.762887      81     1552  19.160494      81      714  8.8148148
6   Yên Bái                  70      829  11.842857     109      935  8.5779817     111     1264  11.387387
7   Thái Nguyên            2437    70666  28.997128    2929    69822   23.83817    3019    75433  24.986088
8   Lạng Sơn                148     1252  8.4594595     166      883  5.3192771     166     3188  19.204819
9   Bắc Giang               228     3592  15.754386     223     2333  10.461883     244     3001   12.29918
10  Phú Thọ                 725    10519  14.508966    1112     9959  8.9559353    1031    13820  13.404462
11  Lai Châu                124     2547  20.540323     187     2838  15.176471     214     2869  13.406542
12  Sơn La                  405    12687  31.325926     417    10226  24.522782      23      238  10.347826
13  Hòa Bình                159     2222  13.974843     185     1930  10.432432     471    11706  24.853503
Northern Central area and
Central coastal area       9601   316394               9640   268741                332     3195
1   Thanh Hóa               700    16646      23.78     808    15276  18.905941   10866   292413  26.910823
2   Nghệ An                1282    41358   32.26053    1134    40293  35.531746     830    16022  19.303614
3   Hà Tĩnh                 162     1172  7.2345679     157     2555  16.273885    1325    39175  29.566038
4   Quảng Bình              138     4889  35.427536     148     4952  33.459459     167     2854   17.08982
5   Quảng Trị                78     1272  16.307692      79     1171  14.822785     148     5039  34.047297
6   Thừa Thiên-Huế         1952    97154  49.771516    2009    52141  25.953708      80     1246     15.575
7   Đà Nẵng                2394    79458  33.190476    2785    82229  29.525673    2076    56599  27.263487
8   Quảng Nam               650     3771  5.8015385     537     6984  13.005587    3135    90889  28.991707
9   Quảng Ngãi              403     5553  13.779156     280     5769  20.603571     634    10616  16.744479
10  Bình Định               609    27751  45.568144     628    19825  31.568471     375     6270      16.72
11  Phú Yên                 329     4192  12.741641     241     4693  19.473029     696    22994  33.037356
12  Khánh Hòa               724    30423  42.020718     651    28795  44.231951     370     6287  16.991892
13  Ninh Thuận               54      847  15.685185      53      558  10.528302     852    30733  36.071596
14  Bình Thuận              126     1908  15.142857     130     3500  26.923077      53      446  8.4150943
Central highlands          1853    54774               1178    45317                125     3243
1   Kon Tum                 183     2206  12.054645      90     1539       17.1    1271    49400  38.867034
2   Gia Lai                 111     1163  10.477477     100     1415      14.15     190     2984  15.705263
3   Đắk Lắk                 450    14021  31.157778     457    13278  29.054705     103     1570  15.242718
4   Đắk Nông                565     8976  15.886726       -        -          -     491    15761  32.099796
5   Lâm Đồng                544    28408  52.220588     531    29085  54.774011     487    29085  59.722793
South East                15381   549900              13720   447998              15318   485285
1   Bình Phước               97      766  7.8969072     109      952   8.733945     105      879  8.3714286
2   Tây Ninh                 84      805  9.5833333      77      662  8.5974026      77      904   11.74026
3   Bình Dương              761    20824  27.363995     527    13409  25.444023     883    15529  17.586636
4   Đồng Nai                759    19381  25.534914     607    19558  32.220758     684    25987   37.99269
5   Bà Rịa - Vũng Tàu       251     5171  20.601594     335     7808  23.307463     304     7684  25.276316
6   TP. Hồ Chí Minh       13429   502953  37.452752   12065   405609  33.618649   13265   434302  32.740445
Mekong river delta         4239   103312               5101   113450               5273   123067
1   Long An                  84     1295  15.416667      77     1309         17     161     3762   23.36646
2   Tiền Giang              215     3622  16.846512     315     4940   15.68254     325     5879  18.089231
3   Bến Tre                 178     1506  8.4606742     170     1559  9.1705882     166     1803  10.861446
4   Trà Vinh                216     5072  23.481481     413     5179  12.539952     472     5535  11.726695
5   Vĩnh Long               572    12563  21.963287     853    12834  15.045721     469    14212  30.302772
6   Đồng Tháp               438    15400  35.159817     344    10785  31.351744     412    12321   29.90534
7   An Giang                384     8327  21.684896     482     8360  17.344398     514    10767  20.947471
8   Kiên Giang              331     2766  8.3564955     384     3226  8.4010417     380     4221  11.107895
9   Cần Thơ                1523    47008  30.865397    1662    57411  34.543321    1816    53766  29.606828
10  Hậu Giang                43      797  18.534884      48     1326     27.625     126     3625  28.769841
11  Sóc Trăng               105     2097  19.971429     156     2784  17.846154     171     2989  17.479532
12  Bạc Liêu                101     2083  20.623762     101     2557  25.316832     170     2546  14.976471
13  Cà Mau                   49      776  15.836735      96     1180  12.291667      91     1641  18.032967
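The S/T column is the number of students divided by the number of teachers, and each region row is the sum of its provinces. A small sketch of both checks follows, using the first year column of the Red river delta block as listed above; the variable and function names are illustrative.

```python
# (province, teachers, students) for the first year column of the
# Red river delta block, as listed in Appendix E.
RED_RIVER_DELTA = [
    ("Hà Nội", 16476, 606207), ("Hà Tây", 1404, 29435),
    ("Vĩnh Phúc", 536, 17704), ("Bắc Ninh", 522, 7624),
    ("Quảng Ninh", 896, 8100), ("Hải Dương", 761, 9677),
    ("Hải Phòng", 1776, 49913), ("Hưng Yên", 624, 22875),
    ("Thái Bình", 621, 8409), ("Hà Nam", 118, 3922),
    ("Nam Định", 1517, 27081), ("Ninh Bình", 133, 724),
]
REGION_TOTAL = (25384, 791671)  # teachers, students in the region row


def students_per_teacher(students, teachers):
    """Ratio reported in the S/T column."""
    return students / teachers


ratios = {name: students_per_teacher(s, t) for name, t, s in RED_RIVER_DELTA}
total_teachers = sum(t for _, t, _ in RED_RIVER_DELTA)
total_students = sum(s for _, _, s in RED_RIVER_DELTA)

print(f"Hà Nội S/T = {ratios['Hà Nội']:.6f}")
print("Totals match region row:",
      (total_teachers, total_students) == REGION_TOTAL)
```

The same two checks can be repeated for any other block and year column to spot transcription errors in the table.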