Important Ideas in Data Analysis for PreK-12 Students, Teachers, and Teacher Educators
Denise S. Mewborn
University of Georgia
Data analysis/statistics…
• helps us answer questions.
• helps us make better decisions.
• helps us describe and understand our world.
• helps us quantify variability.
What questions can we ask?
• Where are you from?
• How did you get here?
• How long are you staying?/What day are you leaving?
• How many times have you been to TEAM?
• What is your day job?
Where's home?
[Bar chart: number of people (axis from 0 to 100) by location: North, South, East, West, Central, Out of State]
Wait! There’s more!!!!
• Analyze and interpret data
  – Answer the original question
  – Make inferences
  – Make predictions
  – What other questions can we answer with this data display?
Standards 2000
Instructional programs should enable all students to–
• formulate questions that can be addressed with data and collect, organize, and display relevant data to answer them;
• select and use appropriate statistical methods to analyze data;
• develop and evaluate inferences and predictions that are based on data;
Statistical Problem Solving
• Formulate Questions
  – clarify the problem at hand
  – formulate question(s) that can be answered with data
• Collect Data
  – design a plan to collect appropriate data
  – employ the plan to collect the data
• Analyze Data
  – select appropriate graphical or numerical methods
  – use these methods to analyze the data
• Interpret Results
  – interpret the analysis
  – relate the interpretation to the original question
Main Points
• We are not asking enough of students!!!
• We are not providing them with rich enough experiences in data analysis to enable them to move confidently into higher grades or to make sense of the world.
• Statistics is an opportunity to APPLY lots of other mathematical ideas in a context.
• We need to end the “mean-median-mode ad nauseam” pattern we’ve been using.
Big ideas that need more attention
• Context
  – Why do we want to know these things?
• Variability
  – natural vs. induced
• Inference, prediction
THE GAISE FRAMEWORK MODEL
Process components by level (Level A, Level B, Level C)
• Formulate Question
  – Level A: Beginning awareness of the statistics question distinction
  – Level B: Increased awareness of the statistics question distinction
  – Level C: Students can make the statistics question distinction
• Collect Data
  – Level A: Do not yet design for differences
  – Level B: Awareness of design for differences
  – Level C: Students make designs for differences
• Analyze Data
  – Level A: Use particular properties of distributions in context of specific example
  – Level B: Learn to use particular properties of distributions as tools of analysis
  – Level C: Understand and use distributions in analysis as a global concept
• Interpret Results
  – Level A: Do not look beyond the data
  – Level B: Acknowledge that looking beyond the data is feasible
  – Level C: Able to look beyond the data in some contexts
THE FRAMEWORK MODEL
• Nature of Variability
  – Level A: Measurement variability; natural variability; induced variability
  – Level B: Sampling variability
  – Level C: Chance variability
• Focus on Variability
  – Level A: Variability within a group
  – Level B: Variability within a group and variability between groups; co-variability
  – Level C: Variability in model fitting
Classroom Census
• Most common and most appropriate type of data collection for PreK-5
• Involves collecting and analyzing data about us/our classroom
• Examples
  – Favorite ______
  – Type of shoes
  – Lunch count
  – Weather
  – Birthdays
  – Bus riders/car riders/walkers
Pushing to higher levels
• Formulate questions
  – Allow children to generate questions from a context
    • Tie shoes vs. not tie shoes
    • Tie shoes, slip-on shoes, buckle shoes
    • Shoe color
    • Type of soles
    • Material from which shoe is made
Pushing…
• Collect data
  – What data do we need in order to answer our question?
  – How could we get this data?
    • Use actual shoes
    • Raise hands and count
    • Use Unifix cubes to make towers
    • Use sticky notes to make a graph
Pushing…
• Analyze data
  – Decide on an appropriate graphical representation
  – Describe the shape of the distribution
  – Locate individuals within group data
Pushing…
• Interpret results
  – Answer the original question
  – Make inferences
    • Why might so many people be wearing tie shoes today?
  – Make predictions
    • Would you expect the same results if we collected this data in December?
    • Would we get the same results if we collected data from Ms. Murphy’s class?
    • Would we get the same results if we went to <local business> and collected data?
Pushing…
• Extending to new problems
  – What other questions could we answer with this data?
    • How many more people are wearing tie shoes than slip-on shoes?
    • How many people are wearing tie shoes or buckle shoes?
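The two extension questions can be answered directly from a tally. The sketch below is only an illustration: the list of responses is hypothetical classroom data, not data from the talk, and the helper names are made up for this example.

```python
from collections import Counter

# Hypothetical classroom responses; not data from the talk.
shoes = ["tie", "slip-on", "tie", "buckle", "tie",
         "slip-on", "tie", "buckle", "tie", "slip-on"]

# Tally how many children are wearing each type of shoe.
counts = Counter(shoes)


def how_many_more(tally, first, second):
    """How many more people chose `first` than `second`."""
    return tally[first] - tally[second]


def how_many_either(tally, first, second):
    """How many people chose `first` or `second`."""
    return tally[first] + tally[second]


more_tie_than_slip_on = how_many_more(counts, "tie", "slip-on")
tie_or_buckle = how_many_either(counts, "tie", "buckle")

print("Tally:", dict(counts))
print("More tie than slip-on:", more_tie_than_slip_on)
print("Tie or buckle:", tie_or_buckle)
```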
Simple Experiment
• Science experiment
  – Beans grown in dark or light
• Comparison of 2 existing items
  – Sugar content in bubble gum vs. minty gum
Simple experiment
• Formulate questions
  – What things affect how well a bean plant grows? (light, soil, water, temperature)
  – What does it mean that a bean “grows well”?
  – Which condition are we most interested in investigating?
Simple experiment
• Collect Data
  – Plan the experiment
    • Decide what data to collect (height of beans)
    • How will we collect it? (ruler: inches vs. centimeters, Unifix cubes, string)
    • When will we collect it?
  – Conduct the experiment
Simple experiment
• Analyze Data
  – Dot plot
  – Did all beans from one condition grow better than all beans from the other condition?
  – Answer the original question.
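One way to picture this analysis step is a text dot plot for each condition plus a check of whether every bean in one group outgrew every bean in the other. This is a rough sketch under stated assumptions: the heights are hypothetical (in centimeters), and `dot_plot` and `all_better` are illustrative helper names, not part of the talk.

```python
from collections import Counter

# Hypothetical bean heights in centimeters; not measurements from the talk.
# A height of 0 stands for a bean that did not sprout.
light = [0, 0, 9, 10, 10, 11, 12, 12, 13]
dark = [0, 6, 7, 8, 8, 9, 10, 11]


def dot_plot(values, low, high):
    """Return text rows: dots stacked above an integer axis from low to high."""
    counts = Counter(values)
    tallest = max(counts.values())
    rows = []
    for level in range(tallest, 0, -1):
        rows.append("".join(
            " * " if counts[v] >= level else "   "
            for v in range(low, high + 1)
        ))
    # Final row is the axis, one three-character cell per value.
    rows.append("".join(f"{v:^3}" for v in range(low, high + 1)))
    return rows


def all_better(group_a, group_b):
    """True only if every value in group_a exceeds every value in group_b."""
    return min(group_a) > max(group_b)


for name, heights in (("Light", light), ("Dark", dark)):
    print(name)
    print("\n".join(dot_plot(heights, 0, 13)))

print("Did every light bean beat every dark bean?", all_better(light, dark))
```

With overlapping groups like these, the answer to the "all beans" question is no, which is what opens the conversation about variability within each condition.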
Simple experiment
• Interpret Results
  – Does this fit with what you know and observe about growing flowers, plants, and vegetables?
  – Why didn’t some beans in the light sprout at all?
  – Does this mean we can’t grow plants inside?
• Predict
  – Does it matter what kind of seeds we use?
• Extend
  – How much taller was the tallest bean than the shortest bean?
Evolution of the mean
• Level A: fair share
• Level B: balance point of a distribution
• Level C: distribution of sample means
• The Family Size Problem: How large are families today?
Which group is “closer” to being “fair?”
The blue group is closer to fair since it requires only one “step” to make it fair. The lower group requires two “steps.”
How do we define a “step?”
When a snap cube is removed from a stack higher than the fair share value and placed on a stack lower than the fair share value, we count a step.
“fairness” ~ number of steps to make it fair
Fewer steps is closer to fair
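The step count described above can be sketched in a few lines. The stacks below are hypothetical snap-cube heights chosen so that one group needs one step and the other needs two; they are not the stacks shown on the slide, and the function names are illustrative.

```python
def fair_share(stacks):
    """Fair share value: total cubes divided equally among the stacks."""
    total = sum(stacks)
    if total % len(stacks) != 0:
        raise ValueError("cubes cannot be shared evenly among the stacks")
    return total // len(stacks)


def steps_to_fair(stacks):
    """Count cube moves from stacks above the fair share to stacks below it."""
    share = fair_share(stacks)
    # Each cube above the fair share must move exactly once.
    return sum(height - share for height in stacks if height > share)


# Hypothetical stacks of snap cubes; both groups have a fair share of 5.
group_one = [4, 5, 5, 5, 6]
group_two = [3, 5, 5, 5, 7]

print("Group one steps:", steps_to_fair(group_one))
print("Group two steps:", steps_to_fair(group_two))
```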
Students completing Level A understand:
• the notion of “fair share” for a set of numeric data
• the fair share value is also called the mean value
• the algorithm for finding the mean
• the notion of “number of steps” to make fair as a measure of variability about the mean
• the fair share/mean value provides a basis for comparison between two groups of numerical data with different sizes (thus can’t use total)
[Figure: two dot plots, Distribution 1 and Distribution 2, each on an axis from 2 to 10, with every data point labeled by its distance from the mean]
In Distribution 1, the Total Distance from the Mean is 16.
In Distribution 2, the Total Distance from the Mean is 18.
[Figure: dot plot on an axis from 2 to 10, with data points labeled by their distances from the mean: 4, 2, 1, 1, 0, 1, 2, 2, 3]
The total distance for the values below the mean of 6 is 8, the same as the total distance for the values above the mean. So, the distribution will “balance” at 6 (the mean).
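The totals and the balance property can be checked numerically. The data values below are an assumption: they are reconstructed from the distance labels on the dot plots, taking the mean to be 6, and may not match the original figures exactly.

```python
def total_distance(values):
    """Sum of the distances of all values from their mean."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values)


def distances_below_and_above(values):
    """Total distance for values below the mean and for values above it."""
    mean = sum(values) / len(values)
    below = sum(mean - v for v in values if v < mean)
    above = sum(v - mean for v in values if v > mean)
    return below, above


# Assumed reconstruction from the distance labels; both sets have mean 6.
distribution_1 = [2, 4, 5, 5, 6, 7, 8, 8, 9]
distribution_2 = [2, 3, 4, 6, 6, 6, 8, 9, 10]

for name, data in (("Distribution 1", distribution_1),
                   ("Distribution 2", distribution_2)):
    below, above = distances_below_and_above(data)
    print(name, "total distance:", total_distance(data),
          "below:", below, "above:", above)
```

For any data set, the total distance below the mean equals the total distance above it, which is why the mean acts as the balance point.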
The SAD is defined to be:
The Sum of the Absolute Deviations
Relationship between SAD and Number of Steps to Fair from Level A:
SAD = 2 × number of steps
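The relationship holds because each step moves one cube from above the mean to below it, reducing the distance on both sides by 1. A minimal check, using hypothetical data sets with whole-number means (none of them taken from the slides):

```python
def sum_abs_deviations(values):
    """SAD: sum of the absolute deviations from the mean."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values)


def steps_to_fair(values):
    """Number of one-unit moves from values above the mean to values below."""
    mean = sum(values) / len(values)
    return sum(v - mean for v in values if v > mean)


# Hypothetical examples, each with a whole-number mean.
examples = [
    [4, 5, 5, 5, 6],
    [3, 5, 5, 5, 7],
    [2, 4, 5, 5, 6, 7, 8, 8, 9],
]

for data in examples:
    print(data, "SAD:", sum_abs_deviations(data),
          "steps:", steps_to_fair(data))

relation_holds = all(
    sum_abs_deviations(data) == 2 * steps_to_fair(data) for data in examples
)
print("SAD equals twice the steps in every example:", relation_holds)
```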
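[Figure: two dot plots, each on an axis from 2 to 10. In the first, two data points are each labeled with a distance of 4 from the mean. In the second, eight data points are each labeled with a distance of 1 from the mean.]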
The SAD is 8 for each distribution, but in the first distribution the data vary more from the mean.
Why doesn’t the SAD work?
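One likely answer is that the SAD ignores how many data points share the total distance. The sketch below compares the SAD with the mean absolute deviation for two data sets that are an assumed reconstruction from the distance labels on the dot plots (mean taken to be 6).

```python
def sum_abs_deviations(values):
    """SAD: sum of the absolute deviations from the mean."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values)


def mean_abs_deviation(values):
    """MAD: the SAD shared equally among all data points."""
    return sum_abs_deviations(values) / len(values)


# Assumed reconstruction: two points at distance 4, eight points at distance 1.
few_far_points = [2, 10]
many_near_points = [5, 5, 5, 5, 7, 7, 7, 7]

for name, data in (("Few far points", few_far_points),
                   ("Many near points", many_near_points)):
    print(name, "SAD:", sum_abs_deviations(data),
          "MAD:", mean_abs_deviation(data))
```

Both sets have the same SAD, but dividing by the number of points separates them: a typical point is 4 away from the mean in the first set and 1 away in the second.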
Measuring Variation about the Mean
• SAD = Sum of Absolute Deviations
• MAD = Mean of Absolute Deviations
• Variance = Mean of Squared Deviations
• Standard Deviation = Square Root of Variance
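The four measures build on one another, as this short sketch shows. It follows the slide's wording and divides by the number of data points (many textbooks divide by n − 1 for a sample variance instead). The data set is hypothetical.

```python
import math
import statistics


def deviations(values):
    """Deviation of each value from the mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def sum_abs_deviations(values):
    """SAD = Sum of Absolute Deviations."""
    return sum(abs(d) for d in deviations(values))


def mean_abs_deviation(values):
    """MAD = Mean of Absolute Deviations."""
    return sum_abs_deviations(values) / len(values)


def variance(values):
    """Variance = Mean of Squared Deviations (dividing by n)."""
    return sum(d * d for d in deviations(values)) / len(values)


def standard_deviation(values):
    """Standard Deviation = Square Root of Variance."""
    return math.sqrt(variance(values))


# Hypothetical data set with mean 6.
data = [2, 4, 5, 5, 6, 7, 8, 8, 9]

print("SAD:", sum_abs_deviations(data))
print("MAD:", mean_abs_deviation(data))
print("Variance:", variance(data))
print("Standard deviation:", standard_deviation(data))

# The standard library's population versions use the same divide-by-n rule.
print("statistics.pvariance:", statistics.pvariance(data))
```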
Summary of Level B and Transitions to Level C
• Mean as the balance point of a distribution
• Mean as a “central” point
• Various measures of variation about the mean
Level C
• Sampling distribution of the sample means
  – Links probability and statistics
  – Transition from descriptive to inferential statistics
Activity
• Choose 10 circles that you think have a diameter close to the mean. Find the mean diameter of your 10 circles.
vs.
• Select random samples of 10 circles and find the mean.
[Figure: Dotplot of Random Selection versus Self Selection. Sample means on an axis from 1.0 to 2.2, with one row of dots for Random Selection and one for Self Selection. Population Mean = 1.25]
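The comparison can be imitated with a small simulation. Everything here is an assumption made for illustration: the population of circle diameters is invented (built so its mean is 1.25), and self-selection is modeled crudely as choosing circles with probability proportional to diameter, on the guess that larger circles catch the eye. Real self-selected samples may behave differently.

```python
import random
import statistics

# Hypothetical population of 100 circle diameters with mean 1.25.
population = [0.5] * 40 + [1.0] * 30 + [2.0] * 15 + [3.0] * 15
population_mean = statistics.mean(population)

# Fixed seed so the simulation gives the same result on every run.
rng = random.Random(0)


def random_sample_mean(size=10):
    """Mean diameter of a simple random sample drawn without replacement."""
    return statistics.mean(rng.sample(population, size))


def size_biased_sample_mean(size=10):
    """Mean of a sample where bigger circles are more likely to be picked.

    This is an assumed stand-in for self-selection, drawn with replacement.
    """
    picks = rng.choices(population, weights=population, k=size)
    return statistics.mean(picks)


random_means = [random_sample_mean() for _ in range(1000)]
biased_means = [size_biased_sample_mean() for _ in range(1000)]

random_center = statistics.mean(random_means)
biased_center = statistics.mean(biased_means)

print("Population mean:", population_mean)
print("Center of random-selection sample means:", round(random_center, 2))
print("Center of size-biased sample means:", round(biased_center, 2))
```

Under these assumptions the random-selection means cluster around the population mean, while the size-biased means sit noticeably higher.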
Sampling Distributions provide the link to two important concepts in statistical inference:
• Margin of Error
• Statistical Significance
Resources
• NCTM Principles and Standards
• GAISE Framework
• NCTM Navigations series
• Quantitative Literacy series
Statistical Problem Solving
• Formulate Questions
  – clarify the problem at hand
  – formulate question(s) that can be answered with data
• Collect Data
  – design a plan to collect appropriate data
  – employ the plan to collect the data
• Analyze Data
  – select appropriate graphical or numerical methods
  – use these methods to analyze the data
• Interpret Results
  – interpret the analysis
  – relate the interpretation to the original question