Data Analysis Plan Handout
-
4/11/2015
1
Research Methods for Business & Managers
The data analysis plan requires not just identification of the chosen
techniques but a meaningful discussion of their suitability and, if
possible, a short discussion of the procedure involved in applying
them. This section is heavily weighted in the overall scheme of
things, so please give it due diligence.
-
Explanation through numbers
Objective
Deductive reasoning
Predefined variables and measurement
Data collection before analysis
Cause and effect relationships
Explanation through words
Subjective
Inductive reasoning
Creativity, extraneous variables
Data collection and analysis intertwined
Description, meaning
-
Quantitative measures are typically referred to as variables.
A variable is anything that can take different values, e.g. numbers or
names.
Any variable that is affected by, or whose value is changed by,
another variable is known as a dependent (y) variable. E.g. if
performance changes when pay is adjusted, then performance is the
dependent variable. Performance can also be called the outcome.
Variables which are viewed as impacting upon the outcome are often
referred to as independent (x) variables. So pay is the independent
variable.
-
A nominal variable relates to a set of categories (such as ethnic groups,
political parties, gender) which is not ordered and which cannot be
ranked or rated.
An ordinal variable relates to a set of categories in which the categories
are ordered (such as levels of educational qualification, organizational
rank, Likert scales).
An interval-level (or ratio-level) variable relates to a scale measure
(such as age or income) that can be subjected to mathematical operations
such as averaging.
Univariate analysis - where a single variable is considered, e.g. an analysis
of pay in a particular organization. Also known as simple statistics.
Bivariate analysis - where the relationship between two variables is
considered, e.g. the relationship between pay and performance. Also known as
effect or outcome statistics.
Multivariate analysis - where the aim is to explain how two or more variables
are related to one or more other variables, e.g. pay and working conditions
impacting performance and motivation. Also known as (multiple) effect or
outcome statistics.
-
Descriptive or Simple Statistics
Summarize data
Effect Statistics:
Associational - which measure connections
Inferential - which allow generalizations from samples to populations
Simple (or descriptive) statistics are used for nominal and ordinal
variables
They are usually displayed and described using frequencies,
proportions or odds
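As a rough illustration of frequencies, proportions and odds for a nominal variable, the following Python sketch may help. The `responses` data and the department names are hypothetical and are not taken from the handout.

```python
from collections import Counter

# Hypothetical nominal data: the department of each respondent (illustrative only).
responses = ["Sales", "Production", "Sales", "Accounting", "Sales", "Production"]

counts = Counter(responses)          # frequency of each category
total = sum(counts.values())         # number of cases

for category, count in counts.most_common():
    proportion = count / total
    # Odds = cases in the category divided by cases not in the category.
    odds = count / (total - count)
    print(f"{category:<12} n={count}  proportion={proportion:.2f}  odds={odds:.2f}")
```

Note that the odds are undefined when every case falls into a single category, so real data would need a guard against division by zero.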
-
Frequency Distribution
Counts and Percentages - A simple table showing how many, or what percent, of the cases fall into each variable category.
Central Tendency or Location
The mode is the most common or frequently occurring number.
The median is the middle point and the 50th percentile.
The mean, the arithmetic average, is the most widely used measure of central tendency
Measuring Dispersion (Spread)
You can measure variation in three ways: range, percentile, and standard deviation.
Range is the distance between the largest and smallest scores
Percentiles tell us the score at a specific place within the distribution.
Standard deviation = a widely used measure of the variability of a variable that indicates the
average distance of cases from the mean value.
Z-scores = a standardized measure that allows comparisons of groups that differ in their means
and standard deviations.
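The measures of central tendency and dispersion listed above can be sketched with Python's standard `statistics` module. The `pay` figures are hypothetical, and the sample (n - 1) form of the standard deviation is an assumption, since the handout does not say which form it intends.

```python
import statistics

# Hypothetical monthly pay figures (illustrative only, not from the handout).
pay = [2500, 3000, 3000, 3500, 4000, 4500, 5500]

mode = statistics.mode(pay)                  # most frequently occurring value
median = statistics.median(pay)              # middle point (50th percentile)
mean = statistics.mean(pay)                  # arithmetic average
value_range = max(pay) - min(pay)            # largest score minus smallest score
quartiles = statistics.quantiles(pay, n=4)   # 25th, 50th and 75th percentiles
sd = statistics.stdev(pay)                   # sample standard deviation (n - 1)

# Z-score: how many standard deviations each case lies from the mean.
z_scores = [(x - mean) / sd for x in pay]

print(mode, median, round(mean, 1), value_range, quartiles, round(sd, 1))
```

Because z-scores are expressed in standard deviation units, they can be compared across groups whose means and standard deviations differ.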
Charts and graphs are suitable for presenting and
summarizing frequency data
Type of Charts
Bar Chart, Pie Chart
Histogram
Frequency Polygon
Type of Data Bar Chart Pie Chart Histogram Frequency Polygon
Nominal X X
Ordinal X
Interval X X
-
Do you want to know how many individuals checked each answer? Frequency
Do you want the proportion of people who answered in a certain way? Percentage
Do you want the average number or average score? Mean
Do you want the middle value in a range of values or scores? Median
Do you want to show the range in answers or scores? Range
Do you want to compare one group to another? Cross tab
Do you want to show the degree to which a response varies from the mean? Standard deviation
The choice of effect statistic depends on the type of y and x variables. Main ones:
Y          X          Test                              Shows
numeric    numeric    linear regression                 slope, intercept, correlation
numeric    nominal    t-test; ANOVA                     difference in mean
nominal    nominal    chi-square; contingency table     differences in frequency or ratio
nominal    numeric    categorical modeling              relative risk or odds ratio
ordinal    whatever   regression; t-test                implies causal direction
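For the numeric y and numeric x case, the slope, intercept and correlation can be computed by ordinary least squares. The sketch below uses hypothetical pay and performance figures, following the handout's example of pay as the independent variable. The variable names are illustrative.

```python
import math

# Hypothetical data: pay (x, independent) and performance score (y, dependent).
pay = [20, 25, 30, 35, 40]
performance = [52, 58, 61, 69, 70]

n = len(pay)
mean_x = sum(pay) / n
mean_y = sum(performance) / n

# Sums of squares and cross-products around the means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(pay, performance))
sxx = sum((x - mean_x) ** 2 for x in pay)
syy = sum((y - mean_y) ** 2 for y in performance)

slope = sxy / sxx                      # change in y per unit change in x
intercept = mean_y - slope * mean_x    # predicted y when x is zero
r = sxy / math.sqrt(sxx * syy)         # Pearson correlation coefficient

print(f"slope={slope:.2f} intercept={intercept:.2f} r={r:.3f}")
```

A correlation close to 1 or -1 indicates a strong linear association. On its own it does not establish that pay causes the change in performance.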
-
Measures of association measure the strength of the
association between two variables
Covariation or correlation = When two or more variables go together
or are associated with one another.
Statistical Independence = The absence of an association or
covariation between two variables.
Quantitative Analysis Techniques - Examples of associational statistics
Method              Purpose                  Examples of application
Cross-tabulations   Frequency distribution   A preference for a brand of cereal based on gender
Scatter diagrams    Frequency distribution   Exploring the link between car mileage and petrol consumption
-
Scattergrams
A graph on which you plot the value of each case or observation. Each
axis of the graph represents the values of one variable, and the graph
can reveal bivariate relations.
Bi-variate cross-tabulation = Placing two variables in a
table at the same time allows you to see how cases that
have values on one variable align with values on a second
variable for those same cases.
Multi-variate cross-tabulation = a table with two or
more variables that have been cross-tabulated
-
Gender * Promotions Crosstabulation
                                        Promotions
                                        Not Promoted   Promoted   Total
Gender  Male    Count                   812            385        1197
                Expected Count          800.8          396.2      1197.0
                % within Gender         67.8%          32.2%      100.0%
                % within Promotions     95.9%          91.9%      94.5%
        Female  Count                   35             34         69
                Expected Count          46.2           22.8       69.0
                % within Gender         50.7%          49.3%      100.0%
                % within Promotions     4.1%           8.1%       5.5%
Total           Count                   847            419        1266
                Expected Count          847.0          419.0      1266.0
                % within Gender         66.9%          33.1%      100.0%
                % within Promotions     100.0%         100.0%     100.0%
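The expected counts in the crosstabulation above follow from (row total x column total) / grand total. The sketch below reproduces them from the observed counts and adds a Pearson chi-square statistic. The handout does not report a chi-square value, so that part is an illustration of the technique and not a figure from the source, and no continuity correction is applied.

```python
# Observed counts from the Gender * Promotions crosstabulation.
observed = {
    "Male": {"Not Promoted": 812, "Promoted": 385},
    "Female": {"Not Promoted": 35, "Promoted": 34},
}

row_totals = {gender: sum(cells.values()) for gender, cells in observed.items()}
col_totals = {}
for cells in observed.values():
    for outcome, count in cells.items():
        col_totals[outcome] = col_totals.get(outcome, 0) + count
grand_total = sum(row_totals.values())

# Expected count under independence: row total * column total / grand total.
expected = {
    gender: {
        outcome: row_totals[gender] * col_totals[outcome] / grand_total
        for outcome in col_totals
    }
    for gender in observed
}

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (observed[g][o] - expected[g][o]) ** 2 / expected[g][o]
    for g in observed
    for o in col_totals
)

print({g: {o: round(e, 1) for o, e in cells.items()} for g, cells in expected.items()})
print(f"chi-square = {chi_square:.2f} (df = 1)")
```

For a 2 x 2 table there is one degree of freedom, and the 5% critical value is 3.841. A statistic above that value suggests gender and promotion are not statistically independent in this sample.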
Contingency table = A table with two or more variables
that have been cross-tabulated.
Department        No. of Male Managers   Salary Ranges   No. of Female Managers   Salary Ranges
Production        16                     $2500-$5500     22                       $2000-$5000
Sales             11                     $4000-$7000     16                       $3500-$6500
Accounting        9                      $4500-$7500     8                        $4000-$7000
Human Resources   5                      $4000-$7000     9                        $4000-$7000
Marketing         1                      $4000-$7000     3                        $4000-$7000
-
Inferential statistics involve using quantitative data collected from a
sample to draw conclusions about a complete population
A population includes the totality of observations that might
be made
A sample comprises a subset of the population
where observations will be or have been made
Techniques (P = parametric, NP = non-parametric):
Hypothesis testing
Confidence intervals
Time series analysis
Pearson's coefficient (P)
Spearman's coefficient of rank correlation (NP)
Student's t-Test
Simple regression (P)
Multiple regression (P)
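As one example of generalizing from a sample to a population, the sketch below estimates a 95% confidence interval for a population mean. The `sample` scores are hypothetical, and the interval uses a normal approximation. For a sample this small, a Student's t critical value would normally give a slightly wider and more appropriate interval.

```python
import math
import statistics

# Hypothetical sample of job-satisfaction scores (illustrative only).
sample = [62, 70, 58, 66, 74, 69, 61, 65, 72, 68, 64, 67]

n = len(sample)
mean = statistics.mean(sample)
# Standard error of the mean: sample standard deviation / sqrt(n).
standard_error = statistics.stdev(sample) / math.sqrt(n)

# Critical value for a two-sided 95% interval under the normal approximation.
z = statistics.NormalDist().inv_cdf(0.975)

ci_lower = mean - z * standard_error
ci_upper = mean + z * standard_error

print(f"mean = {mean:.2f}, 95% CI = ({ci_lower:.2f}, {ci_upper:.2f})")
```

The interval is read as a range of plausible values for the population mean, given the sample.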
-
Components: Data Reduction; Data Display; Conclusions & Verification
Procedures: Coding; Categorisation; Abstraction; Comparison; Dimensionalisation; Integration; Interpretation
Outcomes: Description; Explanation/Interpretation
-
As the name implies, similar to grounded theory as described
in our look at research strategies
Given this reasoning, three key steps are normally involved in this
type of analysis:
Open coding - the initial attempt to develop categories which
illuminate the data
Axial coding - saturation of categories and development of
subcategories
Selective coding - the process of integrating and refining categories
to form a larger theoretical scheme
Appropriate for data that are collected through narrative discourse
Where the data are analyzed by following the sequence of the narrative
to ensure that meaning and context are not lost
Usually follows a pattern:
What is the story about?
What happened, to whom, where, and why?
What were the consequences of this?
What is the significance of these events?
What was the final outcome?
-
Focuses on language as a social practice in its own right
and is concerned with how individuals use language in
specific social contexts
Enables researcher to gain an understanding of how and
why individuals use language to construct themselves
and the world around them
Many different branches; the most popular is critical
discourse analysis
Involves analyzing images that may come from primary or
secondary findings
Used, for example:
When you wish to analyze how many magazine ads used
celebrity endorsements
To find the most popular USP of ads
Although less time consuming than other methods, it is
more challenging to interpret data on the basis of visual
images
-
Analysis of written documents
Developing categories of words and phrases
Looks at frequency of words, uses word counts
Used for historical trends
e.g. feminism in women's magazines over the last 10 years
e.g. number of centimetres devoted to sport in newspapers
Can be used to analyse interview texts
e.g. counting expressions of conflict
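Counting expressions of conflict in an interview text can be sketched as a simple word count. The `interview_text` excerpt and the `conflict_terms` category are hypothetical. In practice the category would come from the coding scheme developed for the study.

```python
import re
from collections import Counter

# Hypothetical interview excerpt (illustrative only).
interview_text = (
    "We argue about deadlines. The dispute over budget led to another "
    "argument, and the team still disagrees."
)

# Hypothetical coding category: words treated as expressions of conflict.
conflict_terms = {"argue", "argument", "dispute", "disagrees", "clash"}

# Lower-case the text and split it into words.
words = re.findall(r"[a-z']+", interview_text.lower())
word_counts = Counter(words)

# Keep only the terms in the category that actually occur in the text.
conflict_counts = {
    term: word_counts[term] for term in sorted(conflict_terms) if word_counts[term]
}
total_conflict = sum(conflict_counts.values())

print(conflict_counts)
print("Total expressions of conflict:", total_conflict)
```

Exact word matching of this kind misses variants such as "arguing" or "disputes", so a real study would usually extend the category list or group words by stem.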