Data management isu
Uploaded by rooney-page
Competition to get into a Post-Secondary institution has greatly increased, and the change in the school curriculum has affected, and will continue to affect, the graduates of 2003.
54 % are females, 46 % are males
54 % are in Grade 12, 46 % are in OAC
The majority of students surveyed want to go to University
College is the second highest choice
More females want to go to University than males
Most people applied to only 3 Universities
Of those people, the majority were males; females typically applied to more than 3 Universities
75 % of those getting 90 - 99% are females
75 % of those who have a 90 - 99% average are Gr. 12’s
48 % of the students who have an 80 - 89 % average are OAC’s
Everyone who wants to go to college next year has an average of 70 to 79 %
All students with an average higher than 90 %, and most of those with an 80 - 89 % average,
want to attend University
Most OAC’s fall into this group
Sampling Technique:
My survey used a convenience sample, since I asked the students in my classes if they would respond to my survey.
Bias:
Since I received most of my data for this survey through my classes, it has some bias, because a large number of those students are taking University courses. It is also based on OAC’s and Gr. 12’s only. This could be classified as a sampling bias, since the sample under-represents some parts of the population. As a result, the results lean towards those who are interested in University and away from those who may be working instead of attending a post-secondary institution.
Frequency Distribution Table and Weighted Means
Medians and Modes
Standard Deviations
Z-scores
Percentiles
GRADE 12
Marks       Freq.   Mid Pt.
0-49 %      23      24.5
50-59 %     30      54.5
60-69 %     68      64.5
70-79 %     79      74.5
80-89 %     51      84.5
90-99 %     6       94.5
Weighted Mean = 67.5 %

OAC
Marks       Freq.   Mid Pt.
0-49 %      16      24.5
50-59 %     24      54.5
60-69 %     49      64.5
70-79 %     63      74.5
80-89 %     62      84.5
90-99 %     15      94.5
Weighted Mean = 70.8 %
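The weighted means can be checked with a short sketch like the one below. It assumes the frequency tables read as Grade 12 (257 students) and OAC (229 students), with each class represented by its midpoint; the names `GRADE_12`, `OAC` and `weighted_mean` are only illustrative.

```python
# Grouped frequency tables from the slides, as (class midpoint, frequency).
# Assumption: the 257-student table is Grade 12 and the 229-student table is OAC.
GRADE_12 = [(24.5, 23), (54.5, 30), (64.5, 68), (74.5, 79), (84.5, 51), (94.5, 6)]
OAC = [(24.5, 16), (54.5, 24), (64.5, 49), (74.5, 63), (84.5, 62), (94.5, 15)]


def weighted_mean(table):
    """Mean of grouped data: sum(midpoint * frequency) / sum(frequency)."""
    total = sum(freq for _, freq in table)
    return sum(mid * freq for mid, freq in table) / total


print(round(weighted_mean(GRADE_12), 1))  # 67.5
print(round(weighted_mean(OAC), 1))       # 70.8
```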
GRADE 12
Median:
257 / 2 = 128.5, so the 129th position
= 70 - 79 %
OAC
Median:
229 / 2 = 114.5, so the 115th position
= 70 - 79 %
Mode:
Most students are part of the 70 - 79 % range in both grades. In Grade 12 there are 79 students (or 30.7 %) in this range and in OAC there are 63 students (or 27.5 %) with this average.
* MMR stats. from January 2003
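As a rough check on the medians and modes, the sketch below walks the cumulative frequencies to find the class that holds the middle position, and picks the class with the largest frequency as the mode. The helper names are hypothetical, and the frequencies are assumed to be the Grade 12 and OAC tables from the slides.

```python
import math

CLASSES = ["0-49", "50-59", "60-69", "70-79", "80-89", "90-99"]
GRADE_12 = [23, 30, 68, 79, 51, 6]   # assumed Grade 12 frequencies (n = 257)
OAC = [16, 24, 49, 63, 62, 15]       # assumed OAC frequencies (n = 229)


def median_class(freqs):
    """Return (middle position, class label) using cumulative frequencies."""
    position = math.ceil(sum(freqs) / 2)  # 257 / 2 = 128.5 -> 129th position
    running = 0
    for label, freq in zip(CLASSES, freqs):
        running += freq
        if running >= position:
            return position, label


def modal_class(freqs):
    """Return (class label, percent of students) for the most common class."""
    top = max(freqs)
    return CLASSES[freqs.index(top)], round(100 * top / sum(freqs), 1)


print(median_class(GRADE_12), modal_class(GRADE_12))
print(median_class(OAC), modal_class(OAC))
```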
Standard deviations: Grade 12 = 16.6, OAC = 16.7
This shows that the Grade 12 averages are slightly less spread out than the OAC marks.
According to the Normal Distribution graph, about 68 % of my data should lie within one standard deviation of my mean.
Grade 12: 67.5 +/- 16.6 = 50.9 % to 84.1 %
OAC: 70.8 +/- 16.7 = 54.1 % to 87.5 %
Therefore, about 68 % of the students in each grade should fall into these ranges.
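The standard deviations can be reproduced from the frequency tables with the sketch below. It assumes the population formula (dividing by n rather than n - 1) and class midpoints as the data values; under those assumptions the results round to the figures on the slide.

```python
import math

# (class midpoint, frequency); assumed to be the Grade 12 and OAC tables.
GRADE_12 = [(24.5, 23), (54.5, 30), (64.5, 68), (74.5, 79), (84.5, 51), (94.5, 6)]
OAC = [(24.5, 16), (54.5, 24), (64.5, 49), (74.5, 63), (84.5, 62), (94.5, 15)]


def grouped_mean_sd(table):
    """Mean and population standard deviation of grouped data."""
    n = sum(freq for _, freq in table)
    mean = sum(mid * freq for mid, freq in table) / n
    variance = sum(freq * (mid - mean) ** 2 for mid, freq in table) / n
    return mean, math.sqrt(variance)


for name, table in [("Grade 12", GRADE_12), ("OAC", OAC)]:
    mean, sd = grouped_mean_sd(table)
    # The one-standard-deviation range is mean - sd to mean + sd.
    print(name, round(mean, 1), round(sd, 1))
```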
Scenario:
A Grade 12 student (“Student A”) and an OAC student (“Student B”) both receive a final average mark of 78 % in Math. Their grade averages are 67.5 % and 70.8 % and the standard deviations are 16.6 and 16.7.
Student A
z = (78 - 67.5) / 16.6
= 0.6325
Student B
z = (78 - 70.8) / 16.7
= 0.4311
Therefore, this shows that Student A actually had the better score relative to the rest of their grade, since Student A’s z-score is higher.
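A quick way to check the two z-scores is the sketch below, which applies z = (x - mean) / standard deviation to each student. The function name `z_score` is illustrative only.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (or below) the mean."""
    return (x - mean) / sd


# Student A: Grade 12, mean 67.5 %, standard deviation 16.6
z_a = z_score(78, 67.5, 16.6)
# Student B: OAC, mean 70.8 %, standard deviation 16.7
z_b = z_score(78, 70.8, 16.7)

print(round(z_a, 4))  # 0.6325
print(round(z_b, 4))  # 0.4311
print("Student A" if z_a > z_b else "Student B", "has the higher z-score")
```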
In order to receive the 2002 MMR Scholarship last year a student had to be in the 75th percentile or higher. But due to the Double Cohort this year, students must now be in the 80th percentile or above to get the award for 2003.
Scenario:
Emma received a score of 30 last year and won the award. Based on the matrix below, would she have still earned it if she was in the Double Cohort this year?
MATRIX Scores for Year 2003 (Double Cohort):
2 5 12 15 19 23 27 29 33 39
4 7 13 16 20 24 27 29 33 40
4 8 13 18 21 26 28 30 35 41
4 9 14 18 21 27 29 32 38 41
Solution:
Percentile = [(# of scores below x) + 0.5 (# of scores equal to x)] / (total # of scores) x 100
= [30 + 0.5 (1)] / 40 x 100
= 30.5 / 40 x 100
= 76.25
= 77th percentile (rounded up)
Therefore, Emma is in the 77th percentile based on the information in the Double Cohort year. Due to the increased competition, she does not qualify for the Scholarship this year.
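The percentile calculation can be sketched as follows, using the 40 matrix scores and the formula from the solution. The names `SCORES` and `percentile_rank` are illustrative; the 80th-percentile cutoff is the 2003 requirement stated above.

```python
# The 40 MATRIX scores for 2003 (Double Cohort), row by row.
SCORES = [
    2, 5, 12, 15, 19, 23, 27, 29, 33, 39,
    4, 7, 13, 16, 20, 24, 27, 29, 33, 40,
    4, 8, 13, 18, 21, 26, 28, 30, 35, 41,
    4, 9, 14, 18, 21, 27, 29, 32, 38, 41,
]


def percentile_rank(scores, x):
    """(scores below x + 0.5 * scores equal to x) / total scores * 100."""
    below = sum(1 for s in scores if s < x)
    equal = sum(1 for s in scores if s == x)
    return (below + 0.5 * equal) * 100 / len(scores)


rank = percentile_rank(SCORES, 30)
print(rank)        # 76.25
print(rank >= 80)  # False: below the 80th percentile needed in 2003
```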
Correlation Coefficient
Classifying Linear Correlations
Non-Linear Regressions
Cause and Effect
Venn Diagram
[Graph A: Effect of Homework Hrs on Mark. x-axis: Homework (hrs), 0 to 10; y-axis: Mark (%), 0 to 100]
y = 73.941e^(0.0187x)
R^2 = 0.2337
The correlation coefficient was calculated as 0.484 in Excel (or by taking the square root of R squared).
This number falls between 0.33 and 0.67, which means that there is a moderate, positive linear correlation between the number of hours spent on homework and the student’s avg. mark. Therefore, “Y” tends to increase as “X” increases.
Correlation Coefficient “r”
Negative Linear Correlation:
Perfect: -1
Strong: -1 to -0.67
Moderate: -0.67 to -0.33
Weak: -0.33 to 0
Positive Linear Correlation:
Weak: 0 to 0.33
Moderate: 0.33 to 0.67
Strong: 0.67 to 1
Perfect: 1
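The classification scale can be written as a small function, sketched below. The cutoffs 0.33 and 0.67 come from the scale; how the exact boundary values are assigned (for example, treating 0.67 itself as strong) is an assumption, and `classify_r` is a hypothetical name.

```python
def classify_r(r):
    """Classify a linear correlation coefficient using the 0.33 / 0.67 cutoffs."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and 1")
    if r == 0:
        return "no linear correlation"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        strength = "perfect"
    elif size >= 0.67:   # assumption: the boundary value counts as strong
        strength = "strong"
    elif size >= 0.33:   # assumption: the boundary value counts as moderate
        strength = "moderate"
    else:
        strength = "weak"
    return f"{strength} {direction}"


print(classify_r(0.484))  # moderate positive
```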
[Graph B: Double Cohort Effect on University Applications. x-axis: Year, 1992 to 2004; y-axis: # of Applications, 0 to 600000]
y = 7608x^2 - 3E+07x + 3E+10
R^2 = 0.8612
This graph shows the effect of the Double Cohort on the # of applications to University.
The curve of best fit shown is a Polynomial (quadratic) Regression.
I chose this one because its R-squared value was closest to 1 (which means that it fits the data more closely than the other regression types).
With this information we can predict the number of applications for 2004.
• Both graphs have a “Cause and Effect” relationship.
• Graph A (Homework Hours vs. Marks) shows this Cause-and-Effect relationship because, generally speaking, the more hours you spend doing homework, the better your mark.
• Graph B shows a Common-Cause Factor because, as the population grows over the years, the growing number of students sends in more applications.
Outliers:
• The Double Cohort (the jump in 2003 in Graph B) can be considered an outlier or an ‘extraneous variable’ because it does not fit with the rest of the data and may skew it.
50 students took this survey. The results are shown below.
RESULTS:
19 students are in Athletics
7 are on the Student Council
8 participate in School Clubs
2 are involved in both the Athletics and the Student Council
1 student is involved with the Student Council and School Club
2 are part of the Athletics and a School Club
1 student does all three
Construct a Venn Diagram to show the relationships.
Solution:
[Venn Diagram: Athletics, Student Council, School Clubs]
Athletics only: 14
Student Council only: 3
School Clubs only: 5
Athletics and Student Council only: 2
Athletics and School Clubs only: 2
Student Council and School Clubs only: 1
All three: 1
None: 22
This shows that 56 % of the students surveyed are involved in the school in some way. 10 % participate in two activities and only 1 person (2 %) is engaged in all three.
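The percentages in this solution can be checked from the region counts, as in the sketch below. The counts are read from the Venn diagram; which region each number belongs to (and reading the diagram's "11" as two separate 1s) is an assumption, and the dictionary keys are illustrative names.

```python
# Region counts as read from the Venn diagram (assumed assignment of regions).
REGIONS = {
    "athletics_only": 14,
    "council_only": 3,
    "clubs_only": 5,
    "athletics_and_council": 2,
    "athletics_and_clubs": 2,
    "council_and_clubs": 1,
    "all_three": 1,
    "none": 22,
}

total = sum(REGIONS.values())
involved = total - REGIONS["none"]
exactly_two = (REGIONS["athletics_and_council"]
               + REGIONS["athletics_and_clubs"]
               + REGIONS["council_and_clubs"])


def percent(count):
    """Share of all surveyed students, as a percentage."""
    return 100 * count / total


print(total)                          # 50
print(percent(involved))              # 56.0
print(percent(exactly_two))           # 10.0
print(percent(REGIONS["all_three"]))  # 2.0
```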
Scenario:
Queen’s University is selecting 5 people for their President’s Scholarship. They are choosing from an eligible group of 4 Grade 12’s and 5 OAC’s.
a) How many ways can they do this?
* Assumption: there are no restrictions and order does not matter *
9 C 5 = 126 ways
b) How many ways can they do this by choosing at least 3 Gr. 12’s?
Ways = (4 C 3 x 5 C 2) + (4 C 4 x 5 C 1) = 40 + 5 = 45
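Both counts can be verified with Python's `math.comb` (available from Python 3.8), as sketched below. Part b) adds the two cases that satisfy "at least 3 Gr. 12’s": exactly 3 Grade 12’s with 2 OAC’s, and all 4 Grade 12’s with 1 OAC. The variable names are illustrative.

```python
from math import comb

GRADE_12, OAC, CHOSEN = 4, 5, 5

# a) Any 5 of the 9 eligible students; order does not matter.
total_ways = comb(GRADE_12 + OAC, CHOSEN)

# b) At least 3 Grade 12's: (3 Grade 12's and 2 OAC's) or (4 Grade 12's and 1 OAC).
at_least_three = sum(
    comb(GRADE_12, k) * comb(OAC, CHOSEN - k) for k in range(3, GRADE_12 + 1)
)

print(total_ways)      # 126
print(at_least_three)  # 45
```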
Scenario:
Scott applied to 3 Colleges. The probability of a student like him getting into College in 2003 is 75 %. What is the probability of him being accepted to at least one?
X = {0, 1, 2, 3}
* Assumption: this is a “success”/“failure” scenario
P (x) = (n C x) (p^x) (q^(n-x)), where n = 3, p = 0.75 and q = 0.25
P (0) = 0.02, P (1) = 0.14, P (2) = 0.42, P (3) = 0.42
P (x > 0) = P (1) + P (2) + P (3)
= 98 %
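The binomial probabilities can be reproduced with the sketch below, which assumes the three applications are independent and each has a 75 % chance of success. The function name `binomial_pmf` is illustrative.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(x) = (n C x) * p^x * q^(n - x), with q = 1 - p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 3, 0.75  # 3 applications, 75 % chance of acceptance for each
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
at_least_one = sum(probs[1:])  # P(1) + P(2) + P(3), the same as 1 - P(0)

print([round(q, 2) for q in probs])  # [0.02, 0.14, 0.42, 0.42]
print(round(100 * at_least_one))     # 98
```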
There is increased competition, especially due to the New Curriculum (the Double Cohort)
Why there may be differences in the marks of the 2 grades:
11 % of Gr.12’s and 23 % of OAC’s are not involved in anything (in or out of school)
55 % of OAC’s and 39 % of Gr. 12’s work and/or volunteer for 13 hrs or more per week
Only 4 % of Grade 12’s don’t work or volunteer while this is true for 14 % of the OAC’s
There are more OAC’s (55 %) than Grade 12’s (43 %) who spend 6 or more hours doing leisure activities a week