Sta2 Report

  • 8/7/2019 Sta2 Report

    HANOI UNIVERSITY

    FACULTY OF MANAGEMENT AND TOURISM

    -o0o-

    STATISTICS FOR ECONOMICS

    Is there any difference in the number of students per teacher?

    Case study - ANOVA

    TABLE OF CONTENTS

    Scenario
    Methodology
        Data collection
        Approach
    Analysis and discussion
        Check the required condition
            Normality
            Variances equality
        Hypothesis testing
            2.1. Testing block means
            2.2. Testing treatment means
        Discussion of finding
    Limitation

    Scenario

    In recent years, as demand for human resources has grown, an increasing number of universities have planned to open new faculties and to raise student admissions in these high-demand sectors. However, a mismatch between student enrollment and the number of teachers/lecturers undeniably has a large effect on the quality of education and training. To examine this issue, our group tested whether the number of students per teacher differed between 2005 and 2009 (specifically in 2005, 2007 and 2009) using a statistical technique (two-way ANOVA). The available data are blocked by the six main regions of Vietnam. The test shows that over this five-year period, despite changes in the numbers of both students and teachers, the number of students per teacher stayed nearly the same, which leads to our conclusion that there is no difference among the three years.

    Methodology

    Data collection

    Our objective is to test whether the number of students per teacher in Viet Nam has changed in recent years; specifically, we conduct the test over three years: 2005, 2007, and 2009. Because the data are quantitative and we compare more than two population means, we decided to use the analysis of variance. The data were collected from the Vietnam General Statistics Office website (shown in Appendix E).

    However, many other factors may affect the result of our test, so the variability within the samples might be large. To reduce the variation within each year, we arranged the data in blocks before running the test. We therefore used six regions as blocks (Red River delta, Northern midlands & mountainous, Northern Central and Central Coastal, Highlands, South East, and Mekong River delta) to test for changes in the number of students per teacher in those areas over the three years. Because it was too difficult to cover each region in full, we used Excel to randomly select one province in each region to represent it, which gave us six provinces.
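    As a rough illustration of the random selection step, the sketch below draws one province per region in Python. It is not the Excel procedure we used: the `regions` dictionary holds only a few province names copied from Appendix E for illustration, and the function name `pick_one_per_region` and the seed value are hypothetical.

```python
import random

# Illustrative, deliberately incomplete lists of provinces per region
# (names copied from Appendix E; not the full GSO lists).
regions = {
    "Central highlands": ["Kon Tum", "Gia Lai"],
    "Mekong river delta": ["Long An", "An Giang"],
}


def pick_one_per_region(regions, seed=None):
    """Return one randomly chosen province for each region (hypothetical helper)."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return {region: rng.choice(provinces) for region, provinces in regions.items()}


sample = pick_one_per_region(regions, seed=2009)
for region, province in sample.items():
    print(f"{region}: {province}")
```

    Each region contributes exactly one province, so the number of blocks equals the number of regions.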

    Approach

    To determine whether the number of students per teacher differs across the three years, we must first check the required conditions for the F-test of two-way ANOVA: the random variable is normally distributed and the population variances are equal. We check each condition in turn.

    Analysis and discussion

    Check the required condition

    Normality

    As the histograms in Appendix D show, the three samples do not appear normal. In order to use two-way ANOVA, we nevertheless assume that all three populations are normally distributed.

    Variances equality

    Since the best estimator of population variance is the sample variance, we applied

    After calculating the sample variances (shown in Appendix A), we found that the largest variance is that of 2009 and the smallest is that of 2007, so we use the F-test to make inferences about the ratio of those two population variances.

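    1. Testing hypothesis:

    $H_0: \sigma_1^2 / \sigma_2^2 = 1$

    $H_A: \sigma_1^2 / \sigma_2^2 \neq 1$

    2. Test statistic:

    $F = s_1^2 / s_2^2$ is F-distributed with $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$

    3. Significance level: $\alpha = 0.05$

    4. Decision rule:

    Reject $H_0$ if $F > F_{\alpha/2,\nu_1,\nu_2}$ or $F < F_{1-\alpha/2,\nu_1,\nu_2}$, where $\alpha/2 = .025$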
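    The two-tail variance test can be sketched in Python using the sample variances from the Excel output in Appendix B, where Variable 1 is 2007 and Variable 2 is 2009. One assumption is ours and does not come from the report: the upper critical value $F_{.025,5,5} = 7.15$ is taken from a standard F table, and the lower critical value is obtained as its reciprocal, which is valid here because both samples have the same degrees of freedom.

```python
# Sample variances copied from the Excel output (Appendix B).
var_2007 = 107.9649341  # Variable 1
var_2009 = 156.6043958  # Variable 2
n1 = n2 = 6             # six observations per year
df1, df2 = n1 - 1, n2 - 1

# Test statistic: ratio of the two sample variances.
f_stat = var_2007 / var_2009

# ASSUMPTION: F(.025; 5, 5) = 7.15 comes from a standard F table,
# not from the report itself.
f_upper = 7.15
# F(1 - a/2; v1, v2) = 1 / F(a/2; v2, v1); here v1 == v2 == 5.
f_lower = 1 / f_upper

reject_h0 = f_stat > f_upper or f_stat < f_lower

print(f"F = {f_stat:.7f} with df = ({df1}, {df2})")
print(f"rejection region: F < {f_lower:.3f} or F > {f_upper:.2f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

    Under these assumptions the statistic falls between the two critical values, so the sketch does not reject equality of the two variances.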

    Hypothesis testing

    2.1. Testing block means

    1. Testing hypothesis:

    $H_0$: Block means are all equal

    $H_A$: At least two block means differ

    2. Test statistic:

    $F = \dfrac{MSB}{MSE}$ is F-distributed with $\nu_1 = b - 1$ and $\nu_2 = n - k - b + 1$

    3. Significance level: $\alpha = 0.05$

    4. Decision rule:

    Reject $H_0$ if $F > F_{\alpha,\,b-1,\,n-k-b+1} = F_{.05,5,10} = 3.33$

    5. Value of test statistic:

    As shown in the ANOVA table (Appendix C), $F = 1.98995$

    2.2. Testing treatment means

    1. Testing hypothesis:

    $H_0: \mu_1 = \mu_2 = \mu_3$

    $H_A$: At least two treatment means differ

    2. Test statistic:

    $F = \dfrac{MST}{MSE}$ is F-distributed with $\nu_1 = k - 1$ and $\nu_2 = n - k - b + 1$

    3. Significance level: $\alpha = 0.05$

    4. Decision rule:

    Reject $H_0$ if $F > F_{\alpha,\,k-1,\,n-k-b+1} = F_{.05,2,10} = 4.10$
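    Both decision rules (block means and treatment means) can be checked with a few lines of Python. The sums of squares and degrees of freedom are copied from the Excel ANOVA table in Appendix C, and the critical values 3.33 and 4.10 are the ones stated in the decision rules; the helper name `f_test` is hypothetical.

```python
def f_test(ss, df, mse, f_crit):
    """Return (mean square, F statistic, reject?) for one source of variation."""
    ms = ss / df
    f_stat = ms / mse
    return ms, f_stat, f_stat > f_crit


# Error row of the ANOVA table (Appendix C).
sse, df_error = 937.5724, 10
mse = sse / df_error

# Rows = blocks (six regions); Columns = treatments (three years).
msb, f_blocks, reject_blocks = f_test(932.8621, 5, mse, f_crit=3.33)
mst, f_treatments, reject_treatments = f_test(21.07898, 2, mse, f_crit=4.10)

print(f"blocks:     MSB = {msb:.3f}, F = {f_blocks:.5f}, reject H0: {reject_blocks}")
print(f"treatments: MST = {mst:.4f}, F = {f_treatments:.5f}, reject H0: {reject_treatments}")
```

    Neither F statistic exceeds its critical value, which agrees with the F values reported in Appendix C.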

    Discussion of finding

    The hypothesis tests show that there is not enough evidence to reject the null hypothesis that the ratio of students per teacher in Vietnam did not differ over the five-year period. The data analysis also shows no evidence of a difference among the block means, which represent the six main regions of Vietnam. The ratio therefore appears to be balanced across these six areas.

    Without the test, one might assume that the ratio of students per teacher has risen over the years because of the growth in student numbers in Vietnam. In fact, because of strong demand for high-quality human resources to meet the challenges of economic growth, many universities and colleges have increased their admissions year by year. Aware of this, educational institutions have planned to recruit more teachers to keep pace with the growing number of students and to maintain or improve teaching quality. This helps to explain why the number of students per teacher has remained unchanged over the years. However, compared with the world's standard (15-20 students/teacher) and the goal of the Ministry of Education and Training (20

    Limitation

    Although we did our best in carrying out the test, some limitations remain. The following limitations may reduce the accuracy of our tests:

    + Lack of information: it is difficult to find data covering longer intervals (5-year intervals instead of the 1-year intervals shown previously). One-year intervals may be too short to reflect changes in the number of professors accurately. As a result, our conclusions may not be very precise.

    + Rejection region: we chose $\alpha = 0.05$, which might lead to a Type II error. However, we believe this does not affect our result much.

    + Time consuming: because checking the assumptions is necessary for the test, we spent a lot of time checking the normality of the populations and the equality of their variances. Fortunately, the histograms we drew indicated normally distributed populations, as we expected. Moreover, we also checked SSB to confirm that there is no difference between blocks.

    + Normality: in order to apply the two-way ANOVA test above, we have assumed that the three populations are normally distributed.

    to next year, which means that it does not strongly influence the quality of teaching and studying. However, we still have some recommendations for improving that quality.

    + Reinforcing highly qualified professors: the growing number of university students puts a lot of pressure on education, and a shortage of highly qualified teachers is unavoidable. Therefore, strengthening the pool of highly qualified professors is the first priority.

    + Motivating teachers: teachers should be supported in further study and given suitable compensation. In addition, good relationships between teachers and their students should be fostered. This would reduce the number of teachers who quit their jobs.

    + Changing from traditional classes to new models: taking Hanoi University as an example, students and teachers attend five lectures and five tutorials each week. Consequently, professors and their students have extra time for self-study.

    + Flexible time: both teachers and students can take part in social activities, voluntary events, and part-time jobs in order to gain practical experience and soft skills such as communication skills. In addition, universities can provide sufficient facilities and equipment for teaching.

    year in the regions indicated above. It also means that Vietnamese university education can provide enough teachers to meet the needs of society in general and the increase in enrolment targets over the years. However, some recommendations are still needed to improve the education system, as shown in our report.

    While conducting the research, we encountered some limitations that may have led to inaccurate results. In addition, because of the characteristics of the ANOVA test and time constraints, we cannot show the whole picture of the issue, for example the trend of enrolment targets or changes in teaching methods and class models. If our report has aroused the interest of other researchers in the same topic, we hope that future studies will be conducted over a longer time scale, with more detailed data, and with further knowledge of statistics.

    Reference

    General Statistics Office, Number of teachers, students in universities and colleges by province, http://www.gso.gov.vn/default_en.aspx?tabid=474&idmid=3&ItemID=10207

    http://vietbao.vn/Tuyen-sinh/Chi-tieu-tuyen-sinh-vao-cac-truong-DH-CD-nam-2005/30050060/290/

    http://vietbao.vn/Giao-duc/Chi-tieu-tuyen-sinh-cac-truong-DH-nam-2007/70079320/202/

    Appendixes

    A. Calculating the sample variance

    SUMMARY                                         Count  Sum      Average  Variance
    Red river delta                                 3      79.5843  26.5281  9.12889
    Northern midlands and mountain areas            3      63.4919  21.164   110.341
    Northern Central area and Central coastal area  3      105.621  35.2069  83.1804
    Central highlands                               3      75.6042  25.2014  179.928
    South East                                      3      83.4234  27.8078  85.7486
    Mekong river delta                              3      34.4233  11.4744  10.9987

    2005                                            6      149.567  24.9278  109.518
    2007                                            6      138.567  23.0944  107.965
    2009                                            6      154.015  25.6691  156.604
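    As a cross-check, the block and treatment sums of squares in the ANOVA table (Appendix C) can be recovered from the averages in the summary table. The sketch below assumes the usual randomized block formulas with one observation per cell, $SSB = k \sum_i (\bar{x}_{B_i} - \bar{x})^2$ and $SST = b \sum_j (\bar{x}_{T_j} - \bar{x})^2$; small differences from the Excel values are due to the rounding of the printed averages.

```python
# Averages copied from the summary table above.
block_means = [26.5281, 21.164, 35.2069, 25.2014, 27.8078, 11.4744]  # six regions
treatment_means = [24.9278, 23.0944, 25.6691]                        # 2005, 2007, 2009

b = len(block_means)      # number of blocks
k = len(treatment_means)  # number of treatments

# With one observation per cell, the grand mean is the mean of the treatment means.
grand_mean = sum(treatment_means) / k

# Sum of squares for blocks (Excel "Rows") and treatments (Excel "Columns").
ssb = k * sum((m - grand_mean) ** 2 for m in block_means)
sst = b * sum((m - grand_mean) ** 2 for m in treatment_means)

print(f"grand mean = {grand_mean:.4f}")
print(f"SSB = {ssb:.3f}")
print(f"SST = {sst:.3f}")
```

    The results agree with the Rows and Columns sums of squares in Appendix C to within rounding.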

    B. Check the variances equality

    F-Test Two-Sample for Variances

                  Variable 1   Variable 2
    Mean          23.09443724  25.66908638
    Variance      107.9649341  156.6043958
    Observations  6            6
    df            5            5
    F             0.6894119
    P(F

    C. ANOVA (using Excel)

    Source of Variation  SS        df  MS       F        P-value  F crit
    Rows                 932.8621  5   186.572  1.98995  0.16583  3.32583
    Columns              21.07898  2   10.5395  0.11241  0.89479  4.10282
    Error                937.5724  10  93.7572
    Total                1891.514  17
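    For readers who want to see how a table of this kind is built from raw data, here is a minimal sketch of a two-way ANOVA without replication (randomized block design) in plain Python. It is not the Excel routine we used: the function name `two_way_anova` is hypothetical and the 2 x 3 `toy` data set is invented purely so that the arithmetic is easy to follow.

```python
def two_way_anova(data):
    """Two-way ANOVA without replication.

    data: list of rows; rows are blocks, columns are treatments.
    """
    b = len(data)       # number of blocks
    k = len(data[0])    # number of treatments
    n = b * k
    grand = sum(sum(row) for row in data) / n
    block_means = [sum(row) / k for row in data]
    treat_means = [sum(row[j] for row in data) / b for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ssb = k * sum((m - grand) ** 2 for m in block_means)
    sst = b * sum((m - grand) ** 2 for m in treat_means)
    sse = ss_total - ssb - sst  # what remains after blocks and treatments

    df_b, df_t, df_e = b - 1, k - 1, n - k - b + 1
    mse = sse / df_e
    return {
        "SSB": ssb, "SST": sst, "SSE": sse, "SS_total": ss_total,
        "df": (df_b, df_t, df_e),
        "F_blocks": (ssb / df_b) / mse,
        "F_treatments": (sst / df_t) / mse,
    }


# HYPOTHETICAL toy data: 2 blocks (rows) x 3 treatments (columns).
toy = [[1, 2, 3],
       [2, 4, 6]]
result = two_way_anova(toy)
for name, value in result.items():
    print(name, value)
```

    In the Excel output, "Rows" corresponds to the blocks (regions) and "Columns" to the treatments (years).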

    D. Histograms

    Population 1: 2005

    [Histogram: Population 1 - 2005; x-axis: bins 10, 20, 30, 40, 50, More; y-axis: Frequency]

    Bin   Frequency
    10    0
    20    2
    30    3
    40    0
    50    1
    More  0

    Population 2: 2007

    [Histogram: Population 2 - 2007; x-axis: bins 10, 20, 30, 40, 50, More; y-axis: Frequency]

    Bin   Frequency
    10    1
    20    1
    30    2
    40    2
    50    0
    More  0

    Population 3: 2009

    [Histogram: Population 3 - 2009; x-axis: bins 10, 20, 30, 40, 50, More; y-axis: Frequency]

    Bin   Frequency
    10    0
    20    2
    30    2
    40    2
    50    0
    More  0

    E. Data from GSO

                                                       2007                        2008                        2009
    Num                                                Teacher Student  S/T        Teacher Student  S/T        Teacher Student  S/T
        Whole country                                  61321   1928436             60651   1675700             65115   1796174
        Red river delta                                25384   791671              25310   695089              26409   725976
    1   Hà Nội                                         16476   606207   36.793336  17065   529211   31.011485  18083   541671   29.954709
    2   Hà Tây                                         1404    29435    20.9651
    3   Vĩnh Phúc                                      536     17704    33.029851  568     18384    32.366197  646     19576    30.303406
    4   Bắc Ninh                                       522     7624     14.605364  632     11676    18.474684  543     14530    26.758748
    5   Quảng Ninh                                     896     8100     9.0401786  811     9272     11.432799  870     10277    11.812644
    6   Hải Dương                                      761     9677     12.716163  848     13437    15.845519  876     13312    15.196347
    7   Hải Phòng                                      1776    49913    28.104167  1862    51070    27.427497  1894    53857    28.435586
    8   Hưng Yên                                       624     22875    36.658654  907     22195    24.470783  963     24067    24.991693
    9   Thái Bình                                      621     8409     13.541063  612     7222     11.800654  613     8450     13.784666
    10  Hà Nam                                         118     3922     33.237288  268     3668     13.686567  315     4070     12.920635
    11  Nam Định                                       1517    27081    17.851681  1504    27590    18.344415  1372    34802    25.365889
    12  Ninh Bình                                      133     724      5.443609   233     1364     5.8540773  234     1364     5.8290598
        Northern midlands and mountain areas           4863    112385              5702    105105              5978    120033
    1   Hà Giang                                       71      2134     30.056338  65      1001     15.4       71      1441     20.295775

    2   Cao Bằng                                       107     1410     13.17757   110     1734     15.763636  97      1571     16.195876
    3   Bắc Kạn                                        212     2080     9.8113208  45      967      21.488889  45      688      15.288889
    4   Tuyên Quang                                    80      530      6.625      73      925      12.671233  73      905      12.39726
    5   Lào Cai                                        97      1917     19.762887  81      1552     19.160494  81      714      8.8148148
    6   Yên Bái                                        70      829      11.842857  109     935      8.5779817  111     1264     11.387387
    7   Thái Nguyên                                    2437    70666    28.997128  2929    69822    23.83817   3019    75433    24.986088
    8   Lạng Sơn                                       148     1252     8.4594595  166     883      5.3192771  166     3188     19.204819
    9   Bắc Giang                                      228     3592     15.754386  223     2333     10.461883  244     3001     12.29918
    10  Phú Thọ                                        725     10519    14.508966  1112    9959     8.9559353  1031    13820    13.404462
    11  Lai Châu                                       124     2547     20.540323  187     2838     15.176471  214     2869     13.406542
    12  Sơn La                                         405     12687    31.325926  417     10226    24.522782  23      238      10.347826
    13  Hòa Bình                                       159     2222     13.974843  185     1930     10.432432  471     11706    24.853503
        Northern Central area and Central coastal area 9601    316394              9640    268741              332     3195
    1   Thanh Hóa                                      700     16646    23.78      808     15276    18.905941  10866   292413   26.910823
    2   Nghệ An                                        1282    41358    32.26053   1134    40293    35.531746  830     16022    19.303614
    3   Hà Tĩnh                                        162     1172     7.2345679  157     2555     16.273885  1325    39175    29.566038
    4   Quảng Bình                                     138     4889     35.427536  148     4952     33.459459  167     2854     17.08982
    5   Quảng Trị                                      78      1272     16.307692  79      1171     14.822785  148     5039     34.047297

    6   Thừa Thiên-Huế                                 1952    97154    49.771516  2009    52141    25.953708  80      1246     15.575
    7   Đà Nẵng                                        2394    79458    33.190476  2785    82229    29.525673  2076    56599    27.263487
    8   Quảng Nam                                      650     3771     5.8015385  537     6984     13.005587  3135    90889    28.991707
    9   Quảng Ngãi                                     403     5553     13.779156  280     5769     20.603571  634     10616    16.744479
    10  Bình Định                                      609     27751    45.568144  628     19825    31.568471  375     6270     16.72
    11  Phú Yên                                        329     4192     12.741641  241     4693     19.473029  696     22994    33.037356
    12  Khánh Hòa                                      724     30423    42.020718  651     28795    44.231951  370     6287     16.991892
    13  Ninh Thuận                                     54      847      15.685185  53      558      10.528302  852     30733    36.071596
    14  Bình Thuận                                     126     1908     15.142857  130     3500     26.923077  53      446      8.4150943
        Central highlands                              1853    54774               1178    45317               125     3243
    1   Kon Tum                                        183     2206     12.054645  90      1539     17.1       1271    49400    38.867034
    2   Gia Lai                                        111     1163     10.477477  100     1415     14.15      190     2984     15.705263
    3   Đắk Lắk                                        450     14021    31.157778  457     13278    29.054705  103     1570     15.242718
    4   Đắk Nông                                       565     8976     15.886726  491     15761    32.099796
    5   Lâm Đồng                                       544     28408    52.220588  531     29085    54.774011  487     29085    59.722793
        South East                                     15381   549900              13720   447998              15318   485285
    1   Bình Phước                                     97      766      7.8969072  109     952      8.733945   105     879      8.3714286
    2   Tây Ninh                                       84      805      9.5833333  77      662      8.5974026  77      904      11.74026
    3   Bình Dương                                     761     20824    27.363995  527     13409    25.444023  883     15529    17.586636
    4   Đồng Nai                                       759     19381    25.534914  607     19558    32.220758  684     25987    37.99269

    5   Bà Rịa - Vũng Tàu                              251     5171     20.601594  335     7808     23.307463  304     7684     25.276316
    6   TP. Hồ Chí Minh                                13429   502953   37.452752  12065   405609   33.618649  13265   434302   32.740445
        Mekong river delta                             4239    103312              5101    113450              5273    123067
    1   Long An                                        84      1295     15.416667  77      1309     17         161     3762     23.36646
    2   Tiền Giang                                     215     3622     16.846512  315     4940     15.68254   325     5879     18.089231
    3   Bến Tre                                        178     1506     8.4606742  170     1559     9.1705882  166     1803     10.861446
    4   Trà Vinh                                       216     5072     23.481481  413     5179     12.539952  472     5535     11.726695
    5   Vĩnh Long                                      572     12563    21.963287  853     12834    15.045721  469     14212    30.302772
    6   Đồng Tháp                                      438     15400    35.159817  344     10785    31.351744  412     12321    29.90534
    7   An Giang                                       384     8327     21.684896  482     8360     17.344398  514     10767    20.947471
    8   Kiên Giang                                     331     2766     8.3564955  384     3226     8.4010417  380     4221     11.107895
    9   Cần Thơ                                        1523    47008    30.865397  1662    57411    34.543321  1816    53766    29.606828
    10  Hậu Giang                                      43      797      18.534884  48      1326     27.625     126     3625     28.769841
    11  Sóc Trăng                                      105     2097     19.971429  156     2784     17.846154  171     2989     17.479532
    12  Bạc Liêu                                       101     2083     20.623762  101     2557     25.316832  170     2546     14.976471
    13  Cà Mau                                         49      776      15.836735  96      1180     12.291667  91      1641     18.032967
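    The S/T column in this table is the number of students divided by the number of teachers. As a small check, the sketch below recomputes it in Python for three first-column rows copied from the table (Long An, Gia Lai and Kon Tum); the variable names are ours.

```python
# (province, teachers, students) copied from the first year column of the table.
rows = [
    ("Long An", 84, 1295),
    ("Gia Lai", 111, 1163),
    ("Kon Tum", 183, 2206),
]

# Students per teacher for each province.
ratios = {name: students / teachers for name, teachers, students in rows}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.6f} students per teacher")
```

    The recomputed ratios match the tabulated S/T values for these rows to the printed precision.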
