Data Analysis
A Few Necessary Terms
Categorical Variable: Discrete groups, such as Type of Reach (Riffle, Run, Pool)
Continuous Variable: Measurements along a continuum, such as Flow Velocity
What type of variable would “Mottled Sculpin/m²” be?
What type of variable is “Substrate Type”?
What type of variable is “% of bank that is undercut”?
A Few Necessary Terms
Explanatory Variable: Independent variable. On x-axis. The variable you use as a predictor.
Response Variable: Dependent variable. On y-axis. The variable that is hypothesized to depend on/be predicted by the explanatory variable.
Statistical Tests: Appropriate Use
For our data, the response variable will always be continuous.
T-test: A categorical explanatory variable with exactly 2 groups.
ANOVA: A categorical explanatory variable with more than 2 groups.
Regression: A continuous explanatory variable.
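The rule above can be sketched as a small lookup. This is only an illustration: the function name `choose_test` and its arguments are hypothetical, and it assumes, as stated above, that the response variable is continuous.

```python
from typing import Optional


def choose_test(explanatory_type: str, n_groups: Optional[int] = None) -> str:
    """Suggest a test for a continuous response variable.

    explanatory_type: "categorical" or "continuous" (hypothetical labels).
    n_groups: number of groups when the explanatory variable is categorical.
    """
    if explanatory_type == "continuous":
        return "Regression"
    if explanatory_type == "categorical":
        if n_groups is None or n_groups < 2:
            raise ValueError("a categorical explanatory variable needs at least 2 groups")
        return "T-test" if n_groups == 2 else "ANOVA"
    raise ValueError("explanatory_type must be 'categorical' or 'continuous'")


# Type of Reach has three groups: Riffle, Run, Pool.
print(choose_test("categorical", n_groups=3))
# Flow Velocity is measured along a continuum.
print(choose_test("continuous"))
```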
Statistical Tests
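Hypothesis Testing: In statistics, we are always testing a null hypothesis (H0) against an alternative hypothesis (Ha).
Test Statistic: A single number calculated from the sample data (for example, t or F) that measures how far the data depart from what the null hypothesis predicts.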
p-value: The probability of observing our data or more extreme data assuming the null hypothesis is correct
Statistical Significance: We reject the null hypothesis if the p-value is below a set value, usually 0.05.
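One way to make the p-value definition concrete is an exact permutation test. This is a different technique from the t-test covered next, used here only because it counts "our data or more extreme data" directly. The function name and the sample values are made up for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means.

    Under the null hypothesis the group labels are interchangeable, so every
    way of splitting the pooled values into two groups is equally likely.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    as_extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / total


# Hypothetical densities (fish per square meter) at two sites.
site_a = [0.30, 0.35, 0.40, 0.45]
site_b = [0.10, 0.15, 0.20, 0.25]

# Only 2 of the 70 possible splits are at least as extreme as the observed one,
# so p = 2/70 (about 0.029), which is below 0.05.
p = permutation_p_value(site_a, site_b)
print(p)
```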
Student’s T-Test
Tests the statistical significance of the difference between means from two independent samples.
[Figure: Mottled Sculpin/m² at Cross Plains and at Salmo Pond]
Compares the mean of a continuous response variable between the 2 groups of a categorical explanatory variable.
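A minimal sketch of the test statistic for Student’s t-test, assuming equal variances so that the two sample variances can be pooled. The function name and the sample values are hypothetical and are not data from the slides.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Return (t, degrees of freedom) for a pooled-variance two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(sample_a) - mean(sample_b)) / standard_error
    return t, df


# Hypothetical densities (fish per square meter) at two sites.
site_a = [0.30, 0.35, 0.40, 0.45]
site_b = [0.10, 0.15, 0.20, 0.25]

t, df = student_t(site_a, site_b)
print(t, df)  # t is about 4.38 with 6 degrees of freedom
```

Turning t into a p-value requires the t distribution with `df` degrees of freedom, from a printed table or a statistics package. If SciPy is available, `scipy.stats.ttest_ind(site_a, site_b, equal_var=True)` reports both values.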
Precautions and Limitations
• Meet Assumptions
• Observations come from normally distributed data (check with a histogram)
• Samples are independent
• Equal variance is assumed (check with a boxplot)
• No other sample biases
• Interpreting the p-value
Analysis of Variance (ANOVA)
Tests the statistical significance of the difference between means from two or more independent samples.
ANOVA website
[Figure: Mottled Sculpin/m² for Riffle, Pool, and Run reaches, with the grand mean marked]
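A sketch of how a one-way ANOVA compares group means with the grand mean, assuming the standard F statistic (between-group mean square divided by within-group mean square). The function name and the values for each reach type are invented for illustration.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    k = len(groups)      # number of groups
    n = len(all_values)  # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)

    df_between = k - 1
    df_within = n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical values for three reach types.
riffle = [4, 5, 6]
pool = [1, 2, 3]
run = [2, 3, 4]

f, df_between, df_within = one_way_anova_f([riffle, pool, run])
print(f, df_between, df_within)  # F = 7 with (2, 6) degrees of freedom
```

As with the t-test, converting F to a p-value needs the F distribution. If SciPy is available, `scipy.stats.f_oneway(riffle, pool, run)` reports both.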
Precautions and Limitations
• Meet Assumptions
• Observations from data with a normal distribution
• Samples are independent
• Assumed equal variance
• No other sample biases
• Interpreting the p-value
• Follow a significant ANOVA with pairwise T-tests to find which groups differ
Simple Linear Regression
• What is it? Least squares line
• When is it appropriate to use?
• Assumptions?
• What does the p-value mean? The R-value?
• How to do it in Excel
Simple Linear Regression
[Figure: scatterplot of Brown Trout/m² (y-axis) against Mottled Sculpin/m² (x-axis), R² = 0.6955]
Tests the statistical significance of a relationship between two continuous variables: one explanatory and one response.
Precautions and Limitations
• Meet Assumptions
• Residuals are approximately normally distributed
• Observations are independent
• Residuals have equal variance across the fitted values
• Relationship is linear
• No other sample biases
• Interpret the p-value and R-squared value.
Residual Plots
Residuals are the vertical distances from the observed points to the best-fit line (observed value minus fitted value).
For a least squares line with an intercept, residuals always sum to zero.
Regression chooses the best-fit line to minimize the sum of squared residuals. It is called the Least Squares Line.
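The least squares line, its residuals, and R-squared can be sketched with the standard closed-form formulas for simple linear regression. The function names and the five data points are made up for illustration.

```python
from statistics import mean


def least_squares(x, y):
    """Return (slope, intercept) of the least squares line through (x, y)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def residuals(x, y, slope, intercept):
    """Observed value minus fitted value for each point."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def r_squared(y, resid):
    """Proportion of the variance in y explained by the line."""
    y_bar = mean(y)
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_resid = sum(e ** 2 for e in resid)
    return 1 - ss_resid / ss_total


# Hypothetical data points.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = least_squares(x, y)
resid = residuals(x, y, slope, intercept)
print(slope, intercept)     # about 0.6 and 2.2
print(sum(resid))           # zero, up to floating-point rounding
print(r_squared(y, resid))  # about 0.6
```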
[Figure: scatterplot of Brown Trout/m² (y-axis) against Mottled Sculpin/m² (x-axis), R² = 0.6955, with the residuals marked]
Residual vs. Fitted Value Plots
[Figure: residuals (y-axis) plotted against fitted values (MS_CPUA, x-axis). Model values form the line; observed values are the points.]
Residual Plots Can Help Test Assumptions
[Figure: three residual plots, each centered on 0]
• “Normal” scatter
• Fan shape: unequal variance
• Curve: the relationship is not linear
[Figure: residuals plotted against fitted values (MS_CPUA), shown with the scatterplot of Brown Trout/m² against Mottled Sculpin/m² (R² = 0.6955)]
Have we violated any assumptions?
R-Squared and P-value
High R-Squared
Low p-value (significant relationship)
R-Squared and P-value
Low R-Squared
Low p-value (significant relationship)
R-Squared and P-value
High R-Squared
High p-value (NO significant relationship)
R-Squared and P-value
Low R-Squared
High p-value (No significant relationship)
The p-value indicates the strength of the evidence for a relationship between the two variables: a low p-value means the observed pattern would be unlikely if there were no relationship.
R-Squared indicates how much of the variance in the response variable is explained by the explanatory variable.
You can think of this as a measure of predictability.
If R-Squared is low, other variables likely play a role. If it is high, it DOES NOT INDICATE A SIGNIFICANT RELATIONSHIP!
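A short sketch of why R-squared and significance can disagree. In simple linear regression the slope's test statistic is |t| = sqrt(R² (n − 2) / (1 − R²)), so it depends on the sample size n as well as on R-squared. The function name is hypothetical, and the critical values are hardcoded from a standard two-sided t table at the 0.05 level rather than computed.

```python
from math import sqrt


def slope_t_statistic(r_squared, n):
    """Absolute t statistic for the slope in simple linear regression."""
    return sqrt(r_squared * (n - 2) / (1 - r_squared))


# Two-sided critical values at the 0.05 level, copied from a t table (assumed):
# 12.706 for 1 degree of freedom, about 1.984 for 98 degrees of freedom.
CRITICAL_T_DF_1 = 12.706
CRITICAL_T_DF_98 = 1.984

# High R-squared, but only 3 points: t = 3.0, far below 12.706.
t_small_sample = slope_t_statistic(0.90, n=3)
print(t_small_sample > CRITICAL_T_DF_1)   # False: not significant

# Low R-squared, but 100 points: t is about 2.72, above 1.984.
t_large_sample = slope_t_statistic(0.07, n=100)
print(t_large_sample > CRITICAL_T_DF_98)  # True: significant
```

The first case matches the "High R-Squared, High p-value" slide and the second matches "Low R-Squared, Low p-value".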