SPSS Course Undergrads
Uploaded by paco-porras
Define Variable (Data Menu)
Introduction to SPSS
1. Introduction
1.1 SPSS Datafiles
2. Coding and Entering Data
2.1 Coding Data (Variable View)
2.2 Entering Data (Data View)
3. Descriptive Statistics
3.1 Frequency Distribution
3.2 Crosstabs
3.3 Correlation
3.4 Means
4. Statistical Tests
4.1 Chi Square Test
4.2 Parametric Tests
4.2.1 Independent Sample T-Test
4.2.2 One Way ANOVA
4.2.3 Paired Sample T-Test
4.3 Non-Parametric Tests
4.3.1 Mann-Whitney U Test
4.3.2 Kruskal-Wallis Test
4.3.3 Wilcoxon Test
5. Linear Regression
6. Re-Coding Data
6.1 Recode Command
6.2 Compute Command
1. INTRODUCTION
SPSS (Statistical Package for the Social Sciences) is a very powerful and widely used Data Analysis Application. It provides a user-friendly tool for analysing Questionnaire Data and other Data Sets. It can also be used in conjunction with most standard spreadsheet and word processing packages to produce professional reports on the findings of any Analysis.
1.1 SPSS DATA FILES
Once SPSS is activated, a new Datafile is opened automatically. This file contains two Windows, a Data View and a Variable View (see the tabs at the bottom left of the file).
(i) Data View
The Data View consists of a grid of Columns and Rows, similar to a spreadsheet. The Columns represent Variables (these can be questions on a questionnaire etc.) and the Rows represent Cases (the individuals or companies who fill in the questionnaire or provide the data). The raw data is entered in this Window.
(ii) Variable View
The variable view also contains a grid of rows and columns. However, in this Window the rows represent the variables in the analysis and the columns are used to define the characteristics of each variable. We will discuss this in greater detail in the next section.
2. CODING AND ENTERING DATA
2.1 Coding Data (Variable View)
When processing data it is essential to develop a coding system. This is a system of numerical codes representing the different values in the data set. In SPSS the user can enter this coding system using the Variable View Window. To illustrate this process we will look at a simple questionnaire and show how it should be coded in SPSS. This questionnaire is shown in Appendix A.
The first Question is
What is your gender?
and the codes for the responses are
1 = Male and 2 = Female
How can we represent this variable in SPSS?
Variable View
To code a variable in SPSS we first need to enter the Variable View by clicking the Variable View tab at the bottom of the screen.
Variable Name
In the first column enter the variable name. The variable name can only contain alphanumeric characters, i.e. letters and numbers (no spaces). For the first variable in our questionnaire we will simply use the name Gender. (You could alternatively use the name Q1.)
Variable Type
The Variable Type allows the user to enter the data in different forms: numerical, text, date, financial etc. In quantitative research projects it is recommended to use Numeric when appropriate, and this is the default in SPSS.
Width and Decimals
Width and Decimals will be entered automatically and are only used for presentation purposes. In this case set the decimals to zero.
Variable Label (Optional)
If required, enter a Label in the box provided. The Variable Label provides a more detailed description of the question than the Variable Name, as it can contain many more characters. In our example we might enter "What is your gender?" as the Variable Label. If this variable is used in any analysis it is the Variable Label that appears in the Output. If no Variable Label is entered then the Variable Name is used in any Output.
Values
Value labels are used to define the coding system for the variable. To define a coding system for a variable, click on the Values cell and then click the little grey box in the cell. Type the Value and the corresponding Value Label in the boxes provided.
For Gender, type 1 as the Value and Male as the Value Label and then click the ADD button.
Then Type 2 as the Value and Female as the Value Label and click ADD.
When all Values and Value Labels are entered, click OK.
Missing Values
Invariably when data is collected using self-completed questionnaire forms there will be several questions left unanswered. This can happen for a variety of reasons, including simple carelessness, a lack of willingness on the part of the respondent to supply the desired information or a lack of competence to answer the question. Also a question may not be relevant to some of the respondents. In SPSS we must enter a code to represent missing data.
In this example we will enter the value 0 to represent a non-response. To enter a missing value code, click the grey box in the Missing cell and select Discrete missing values. Enter 0 as the missing value and click OK.
Columns, Align
These are purely for data presentation purposes.
Measure
You must also define the measurement type in SPSS. This is important as this definition will affect the analysis you can do with the variable.
- Select the appropriate scale from the drop-down list.
- There are three choices: Nominal, Ordinal or Scale* (Ratio or Interval).
- Gender is a Nominal Variable.
*Ratio and Interval Measures are coded as Scale Measures in SPSS.
To practise what we have just learned and to illustrate two other points, let's now code some of the other variables on the questionnaire.
Q2. What level of education have you achieved?
Post-Graduate 1   Degree 2   Secondary 3
Question 2 can be coded using the same steps as Question 1. Choose an appropriate variable name yourself.
Q3. What is your age? ____
Question 3 is relatively easy to code as it does not need value labels, because the responses to the question are numerical.
Q4. Do you find time to relax?
Always 1   Usually 2   Sometimes 3   Rarely 4   Never 5
Q5. Do you ever feel stressed at work?
Always 1   Usually 2   Sometimes 3   Rarely 4   Never 5
Questions 4 and 5 can also be coded like question 1. Notice that the responses to both questions are the same. To save time you can copy the labels from Question 4 and paste them in Question 5.
Q6. Does your job involve the following tasks?
Evaluating Staff ___
Managing Staff ___
Training Staff ___
Question 6 is a multiple response question. The respondent is being asked three questions, namely
Does your job involve training staff?
Does your job involve managing staff?
Does your job involve evaluating staff?
The respondent can tick all three responses or none or any combination of them, so you will need three variables to represent this question.
Call these variables Task1, Task2 and Task3, use the three questions as the Variable Labels and use the Values 1 = Yes and 2 = No for each variable.
Q7. What is your job title? _________________
This is called an open question as there are no response categories given. To code it you need to wait until the questionnaires are returned and then code the different responses that are given. This is called post-coding. Let's assume here that after examining all the questionnaires there were four different responses: Training Officer, HR Manager, Managing Director and HR Director. We can now code these options as 1, 2, 3 and 4 and code the variable as we did Question 1.
Your final variable view should look like the graphic on the next page. Don't worry if you used different Variable Names or Labels.
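The coding scheme from this section can also be written down outside SPSS. The Python sketch below is only an illustration, not SPSS syntax: it mirrors the Values and Missing columns of the Variable View as a small codebook. The variable name Relax and the helper `decode` are hypothetical names chosen for the example.

```python
# Hypothetical codebook mirroring the Values and Missing columns of Variable View.
MISSING = 0  # code used in this handout for a non-response

VALUE_LABELS = {
    "Gender": {1: "Male", 2: "Female"},
    "Educ": {1: "Post-Graduate", 2: "Degree", 3: "Secondary"},
    "Relax": {1: "Always", 2: "Usually", 3: "Sometimes", 4: "Rarely", 5: "Never"},
}


def decode(variable, code):
    """Return the value label for a code, or None if the code marks missing data."""
    if code == MISSING:
        return None
    return VALUE_LABELS[variable][code]


print(decode("Gender", 1))  # label for code 1
print(decode("Relax", 0))   # a non-response has no label
```

Keeping the missing-value code separate from the value labels matches what SPSS does: a 0 is stored in the data but excluded from the valid categories.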
2.2 Entering Data
To enter data into an SPSS file we must first return to the Data View. Data is simply typed into the appropriate cell, with each cell representing one individual's answer to a given question.
The next screen contains the data for a number of respondents to this questionnaire.
From this data we can see that the first respondent is Male (Gender = 1), is aged 55, has a postgraduate degree (Educ = 1), usually finds time to relax, is never stressed at work etc.
Now enter some data into your own file. You can make up your responses (4 or 5 respondents is sufficient).
Finally, save the data file using the File Menu. The File Menu in SPSS is similar to that of other standard Windows packages. SPSS datafiles are given the extension .sav; for instance, we could call this file stdata.sav.
3. DESCRIPTIVE STATISTICS
Descriptive Statistics are a group of techniques for describing the breakdown of a variable or variables. They include Frequency Tables (simply tables and charts of counts), Crosstabs, Mean Scores, and other Statistics.
To illustrate the use of these techniques we will use the data in the file staffdata.sav. This file can be found on Moodle.
3.1 Frequencies
The Frequencies procedure provides basic statistics, tables and graphical displays that are useful for describing many types of variables. For a first look at your data, the Frequencies procedure is a good place to start.
To use the Frequencies procedure:
- Select Descriptive Statistics from the Analyse Menu.
- Select Frequencies in the sub-group.
- The Frequencies Window contains a list of all the variables in the file, a Variables box, a Display Frequency Tables box and three pushbuttons, namely Statistics, Charts and Format.
- Select the variables you are interested in from the Variables list on the left and place them in the Variables box on the right (using the arrow in the centre). For this exercise, choose Do you find time to relax and Do you ever feel stressed.
- For a Frequency Distribution Table tick the Display Frequency Tables box. (This will be ticked by default.)
- A selection of basic statistical values like the mean, standard deviation and others is available by clicking the Statistics button.
- To select the appropriate statistic simply click the adjoining circle (see below). For this example select the mean and the standard deviation. When all required statistics have been chosen click Continue.
- If you require a chart click the Charts button and choose the type of chart required. There are three basic charts available: Pie Charts, Bar Charts and Histograms. In this case choose a Bar Chart and then click Continue.
- Finally, once all Statistics and Charts have been selected, click OK.
Below we can see some of the SPSS Output from the Frequencies command for the variable Do you find time to relax (Interval variable).
Frequency Tables
Do you find time to relax?
                    Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Always             15      11.0            11.0                 11.0
        Usually            62      45.6            45.6                 56.6
        Sometimes          51      37.5            37.5                 94.1
        Rarely              8       5.9             5.9                100.0
        Total             136     100.0           100.0
Statistics
Do you find time to relax?
N   Valid              136
    Missing              0
Mean                2.3824
Std. Deviation      .76069
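To see where the figures in this output come from, the following Python sketch (an illustration only, not SPSS syntax) rebuilds the percentages, cumulative percentages, mean and standard deviation from the four frequency counts reported above. It assumes the codes 1 to 4 for Always to Rarely, as defined in the questionnaire.

```python
from math import sqrt

# Counts taken from the frequency table above (codes 1-4; nobody chose 5 = Never).
counts = {1: 15, 2: 62, 3: 51, 4: 8}
labels = {1: "Always", 2: "Usually", 3: "Sometimes", 4: "Rarely"}

n = sum(counts.values())
cumulative = 0.0
rows = []
for code in sorted(counts):
    percent = 100 * counts[code] / n
    cumulative += percent
    rows.append((labels[code], counts[code], round(percent, 1), round(cumulative, 1)))

# Mean of the codes, weighted by how often each code occurs.
mean = sum(code * k for code, k in counts.items()) / n
# Sample standard deviation (divides by n - 1), which matches the SPSS figure.
variance = sum(k * (code - mean) ** 2 for code, k in counts.items()) / (n - 1)
std_dev = sqrt(variance)

for row in rows:
    print(row)
print(round(mean, 4), round(std_dev, 5))
```

Because there are no missing cases here, Percent and Valid Percent coincide; with missing data the Valid Percent would divide by the number of valid cases instead of the total.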
Charts
Exercise: Generate the Descriptive Statistics for Gender and Age. Remember to use the appropriate descriptive statistics.
The output for both of these variables is shown on the next two pages.
GENDER (Nominal Variable)
Frequency Distribution
What is your gender
                 Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Male           104      76.5            76.5                 76.5
        Female          32      23.5            23.5                100.0
        Total          136     100.0           100.0
Chart
Note: No quantitative statistics are calculated as this is a Nominal Variable.
AGE (Ratio Variable)
Statistics
What is your age
N   Valid                  136
    Missing                  0
Mean                   45.2868
Std. Error of Mean      .79636
Median                 45.0000
Std. Deviation         9.28711
Variance                86.250
Minimum                  24.00
Maximum                  65.00
Percentiles   25       39.0000
              50       45.0000
              75       51.7500
Chart (Histogram)
Note:
- A Frequency Distribution for this variable would not be a useful descriptive statistic as there are too many values. The data could be recoded into a smaller number of categories and then graphed. (SPSS has a Recode function.)
- A Histogram is chosen as Age has a large number of categories.
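Several of the Age statistics above are linked by simple formulas: the variance is the square of the standard deviation, and the standard error of the mean is the standard deviation divided by the square root of N. A short Python check (an illustration only, using the figures from the output above) makes this concrete.

```python
from math import sqrt

n = 136
std_dev = 9.28711  # Std. Deviation from the output above

variance = std_dev ** 2        # SPSS reports 86.250
std_error = std_dev / sqrt(n)  # SPSS reports .79636

print(round(variance, 3), round(std_error, 5))
```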
3.2 Crosstabs
A Crosstab is an extension of a frequency table to 2 or more variables. For example you may wish to look at a breakdown of level of education by gender. To create a Crosstab Table
- Select Descriptive Statistics from the Analyse Menu
- Select Crosstabs in the sub-group
- From the Variable List select the variable(s) to represent the columns and place in the Columns box. For this exercise select Gender.
- Do the same thing for the rows box. Select Education Level for the rows.
- To add percentages click the Cells Button
- Choose the percentages required (e.g. column percentages) by clicking the appropriate circle.
- Press Continue
- Finally click OK
The SPSS Output for this Crosstab can be seen on the next page.
What level of education have you achieved? * What is your gender Crosstabulation
                                                        What is your gender
                                                        Male      Female    Total
Post Graduate Studies   Count                             42        14        56
                        % within What is your gender    40.4%     43.8%     41.2%
Primary Degree          Count                             56        16        72
                        % within What is your gender    53.8%     50.0%     52.9%
Leaving Cert            Count                              6         2         8
                        % within What is your gender     5.8%      6.3%      5.9%
Total                   Count                            104        32       136
                        % within What is your gender   100.0%    100.0%    100.0%
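The column percentages in this table, and the statistic used by the Chi-Square Test described in section 4.1, can both be derived from the six observed counts. The Python sketch below is an illustration only, not what SPSS runs internally; the p-value line relies on the fact that, for exactly 2 degrees of freedom, the chi-square upper-tail probability equals exp(-x/2).

```python
from math import exp

# Observed counts from the crosstab above: rows are education levels,
# columns are (Male, Female).
counts = {
    "Post Graduate Studies": (42, 14),
    "Primary Degree": (56, 16),
    "Leaving Cert": (6, 2),
}

column_totals = [sum(row[j] for row in counts.values()) for j in range(2)]
grand_total = sum(column_totals)

# Column percentages: each cell as a share of its gender column.
column_percent = {
    level: [100 * row[j] / column_totals[j] for j in range(2)]
    for level, row in counts.items()
}

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi_square = 0.0
for row in counts.values():
    for j in range(2):
        expected = sum(row) * column_totals[j] / grand_total
        chi_square += (row[j] - expected) ** 2 / expected

df = (len(counts) - 1) * (2 - 1)
# Closed form valid only for 2 degrees of freedom; otherwise left as None.
p_value = exp(-chi_square / 2) if df == 2 else None

for level, pct in column_percent.items():
    print(level, [f"{p:.2f}" for p in pct])
print(round(chi_square, 4), df, round(p_value, 3))
```

A statistic this small relative to its degrees of freedom means the male and female education profiles are very similar, which agrees with the near-identical column percentages in the table.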
3.3 Correlation
- Select Correlate from the Analyse Menu.
- Select Bivariate.
- From the Variable List select the variables you wish to correlate. Choose Age and Number of years in the organisation.
- Select the appropriate correlation statistic:
  Pearson: Ratio/Interval Data
  Spearman or Kendall's tau: Ordinal Data
  Note: Age and Tenure are ratio variables, so Pearson is the correct coefficient.
- Press OK.
Repeat the correlation a second time. This time use the variables Do you find time to relax and Do you get stressed at work. Because these are both interval measurements we can also use the Pearson correlation coefficient.
The Output for both correlations is shown below.
Correlation: Age vs Tenure
Correlations
                                                                How many years are you    What is your age
                                                                in the organisation?
How many years are you in the organisation?   Pearson Correlation        1                   .884(**)
                                              Sig. (2-tailed)                                .000
                                              N                        135                    135
What is your age                              Pearson Correlation     .884(**)                  1
                                              Sig. (2-tailed)         .000
                                              N                        135                    136
** Correlation is significant at the 0.01 level (2-tailed).
Correlation between age and number of years in organisation is .884. This indicates a positive relationship between the two variables, which is exactly what we would expect.
Note the two asterisks beside the correlation. This indicates that the correlation is statistically significant.
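For readers who want to see what the Pearson coefficient measures, here is a minimal Python sketch of the calculation. It is an illustration only: the function name `pearson_r` and the six data points are hypothetical and are not taken from staffdata.sav, so the result will not equal the .884 reported above.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical data for six respondents (not taken from staffdata.sav).
age = [25, 32, 41, 47, 55, 60]
tenure = [2, 5, 12, 15, 28, 30]

r = pearson_r(age, tenure)
print(round(r, 3))
```

The coefficient always lies between -1 and +1; values near +1 mean that older respondents tend to have longer tenure, as in the case study.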
Correlation: Do you find time to relax vs Do you get stressed at work
Correlations
                                                             Do you find time    Do you ever feel
                                                             to relax?           stressed at work?
Do you find time to relax?           Pearson Correlation           1               -.294**
                                     Sig. (2-tailed)                                .001
                                     N                            136                134
Do you ever feel stressed at work?   Pearson Correlation        -.294**               1
                                     Sig. (2-tailed)             .001
                                     N                            134                134
**. Correlation is significant at the 0.01 level (2-tailed).
This time the correlation between the variables is -.294, suggesting a significant negative relationship. In other words, people who relax more don't feel as stressed at work.
Correlation: Grade vs Do you find time to relax
Grade can be treated as an ordinal variable as it has a rank order. Time to relax is an interval variable. When correlating an interval variable with an ordinal variable we must use Spearman's rho or Kendall's tau.
Correlations
                                                                   What is your    Do you find time
                                                                   grade?          to relax?
Spearman's rho   What is your grade?          Correlation Coefficient   1.000         .166
                                              Sig. (2-tailed)               .         .056
                                              N                           133          133
                 Do you find time to relax?   Correlation Coefficient    .166        1.000
                                              Sig. (2-tailed)            .056            .
                                              N                           133          136
The correlation between Grade and Do you find time to relax? is .166, but it is not statistically significant (p = .056), so there is no evidence of a relationship between the two variables. Put simply, your grade appears to have no impact on the time you find to relax.
3.4 Means
This function allows the user to calculate the mean score of a dependent variable across several groups. For example, in our case study we could look at the average age of staff members with different levels of education.
- Select Compare Means from the Analyse Menu.
- Select Means.
- Select the dependent variable(s) from the Variable List. Choose Age for this example (this variable must be a ratio/interval variable).
- Select the independent variable(s) from the Variable List. Level of Education is the independent variable for this example.
- Click OK.
Here is the associated Output from SPSS.
Report
What is your age?
What level of education have you achieved?      Mean       N    Std. Deviation
Post Graduate Studies                         44.1250      56       9.61639
Primary Degree                                45.9167      72       9.38196
Leaving Cert                                  47.7500       8       4.71320
Total                                         45.2868     136       9.28711
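As a quick check on the Report table above, the Total mean is the average of the group means weighted by group size. The Python sketch below (an illustration only) reproduces it from the three rows of the table.

```python
# (mean age, N) for each education level, copied from the Report table above.
groups = {
    "Post Graduate Studies": (44.1250, 56),
    "Primary Degree": (45.9167, 72),
    "Leaving Cert": (47.7500, 8),
}

total_n = sum(n for _, n in groups.values())
# Weight each group mean by its sample size before averaging.
overall_mean = sum(mean * n for mean, n in groups.values()) / total_n

print(total_n, round(overall_mean, 4))
```

This is one reason the sample size belongs next to each mean: without N, the group means cannot be combined correctly.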
Note that the Means command also presents the sample size and standard deviation. It is good practice to present these three statistics together.
3.5 Presentation and Tables Command
The output from the basic descriptive commands in SPSS is not always in a form suitable for reports or presentations. There are a couple of approaches that an SPSS user can take to improve the quality of the output.
Copying SPSS Output to MS Word
Tables and Charts generated in SPSS can be copied directly to MS Word using Copy or Copy Objects (if several tables or charts are being copied). Once in MS Word, the tables and charts can be manipulated using the standard formatting tools.
Tables Command
In the Analyse Menu there is a Tables sub-menu which allows users to create their own self-defined tables. This is particularly useful if a large number of tables are required. Please experiment with the Tables command to learn how it functions. Please note that variables should always be properly labelled if tables are required for reports or presentations.
4. STATISTICAL TESTS
4.1 Chi-Square Test (See Crosstabs)
The Chi-Square Test is used when the test variable in a statistical test is a Nominal variable. For example, we can test if there is a significant difference between male and female education levels.
- Select Descriptive Statistics from the Analyse Menu.
- Select Crosstabs in the sub-group.
- From the Variable List select the variable(s) to represent the columns and place in the Columns box (choose Gender).
- Do the same thing for the Rows box (choose Education Level).
- Add Column percentages using the Cells button as before.
- Click the Statistics button and choose Chi-Square Test.
- Click Continue.
- Click OK.
4.2 Parametric Tests
4.2.1 Independent Sample T-Test
This test examines the difference between 2 Independent Groups with 1 Ratio/Interval Test Variable. For example, is there a significant difference in tenure between males and females?
- In the Analyse Menu click Compare Means.
- Select Independent Sample T-Test.
- Select the Test Variable(s) from the Variable List (How many years in the organisation).
- Select the Grouping Variable from the Variable List (Gender).
- Click the Define Groups button.
- Insert the values representing the two independent groups. Here, the grouping variable is Gender (1 = Male, 2 = Female), so we enter 1 as Group 1 and 2 as Group 2.
- Click Continue.
- Click OK.
Example from Case Study
The following SPSS screen shows the Independent Sample T-Test from the Case Study. How many years are you in the organisation is the Test Variable and Gender is the Grouping Variable. The values for the two groups are 1 and 2, representing males and females.
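The calculation behind this test can be sketched in a few lines of Python. This is an illustration only, covering the "equal variances assumed" form of the test; the function name `independent_t` and the eight data values are hypothetical and are not taken from the case study.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group1, group2):
    """Return (t, df) for a two-sample t-test assuming equal variances."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / std_error, n1 + n2 - 2


# Hypothetical tenure values (years), split by the Gender codes 1 and 2.
tenure = [12, 20, 8, 15, 25, 5, 10, 7]
gender = [1, 1, 2, 1, 1, 2, 2, 2]

males = [t for t, g in zip(tenure, gender) if g == 1]
females = [t for t, g in zip(tenure, gender) if g == 2]
t_stat, df = independent_t(males, females)
print(round(t_stat, 3), df)
```

Splitting the test variable by the codes 1 and 2 is exactly what the Define Groups step tells SPSS to do.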
4.2.2 One-Way Anova
This test examines the difference between 3 or more Independent Groups with 1 Ratio/Interval Test Variable. For example, is there a significant difference in age across educational levels?
- In the Analyse Menu click Compare Means.
- Select One-Way ANOVA.
- Select the Test Variable(s) (Age) and place it in the Dependent Variable box.
- Select the Grouping Variable (Education Level) and place it in the Factor box.
- Select Options and click Descriptive Statistics.
- Click Continue.
- Click OK.
Note: if you require Post Hoc Tests, click the Post Hoc button and choose a test (the Tukey Test is the most commonly used).
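The F statistic that One-Way ANOVA reports compares the variation between group means with the variation within groups. A minimal Python sketch of that calculation follows; it is an illustration only, and the function name `one_way_anova` and the ages are hypothetical.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical ages for three education levels.
ages_by_education = [
    [38, 44, 51, 47],
    [41, 46, 52, 45],
    [50, 47, 49, 44],
]
f_stat, df_b, df_w = one_way_anova(ages_by_education)
print(round(f_stat, 3), df_b, df_w)
```

A large F means the group means differ by more than the spread inside the groups would suggest; the Post Hoc tests then show which pairs of groups differ.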
4.2.3 Paired Sample T-Test
Comparing 2 Ratio/Interval Test Variables.
The Paired Sample T-Test compares the mean difference between our two samples with the difference that we would expect to find between the population means, taking into account the standard error of the difference.
1. In the Analyse Menu click Compare Means.
2. Select Paired Sample T-Test.
3. Click on the first variable, and then click the second variable (e.g. Do you find time to relax and Do you feel stressed at work). They should both appear in the current selection as variables 1 and 2 respectively.
4. Click the black arrow to enter these two variables into the Paired Variables Window.
5. Repeat the process for any other pair of variables to be tested.
6. Click OK.
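The paired test works on the difference between the two answers given by each respondent. The Python sketch below is an illustration only; the function name `paired_t` and the four pairs of coded answers are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) for a paired-samples t-test on two equal-length lists."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    # Standard error of the mean difference.
    std_error = stdev(differences) / sqrt(n)
    return mean(differences) / std_error, n - 1


# Hypothetical coded answers (1 = Always ... 5 = Never) from four respondents.
relax = [2, 4, 5, 4]
stressed = [1, 2, 3, 4]
t_stat, df = paired_t(relax, stressed)
print(round(t_stat, 3), df)
```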
4.3 Non-Parametric Tests
These tests are used for ordinal test variables, or for ratio and interval test variables that are not normally distributed.
4.3.1 Mann-Whitney U Test
Comparing the difference between 2 Independent groups with 1 Ordinal test variable.
- In the Analyse Menu click Non-Parametric Tests.
- Select Legacy Dialogs.
- Select 2 Independent Samples.
- Select the Test Variable(s) from the Variable List (Time to relax).
- Select the Grouping Variable from the Variable List (Gender).
- Click the Define Groups button.
- Insert the values representing the two independent groups.
- Click Continue.
- Click OK.
4.3.2 Kruskal-Wallis Test
Comparing the difference between K > 2 Independent groups with 1 Ordinal test variable.
- In the Analyse Menu click Non-Parametric Tests.
- Select Legacy Dialogs.
- Select K Independent Samples.
- Select the Test Variable(s) from the Variable List.
- Select the Grouping Variable from the Variable List.
- Click the Define Range button.
- Insert the range of values representing the independent groups.
- Click Continue.
- Click OK.
4.3.3 Wilcoxon Test
This test is used for comparing 2 Ordinal test variables.
- In the Analyse Menu click Non-Parametric Tests.
- Select Legacy Dialogs.
- Select 2 Related Samples.
- Click on the first variable, and then click the second variable. They should both appear in the current selection as variables 1 and 2 respectively.
- Click the black arrow to enter these two variables into the Paired Variables Window.
- Repeat the process for any other pair of variables to be tested.
- Click OK.
5. Linear Regression
Regression is used to examine the relationship between one dependent variable and one or more independent variables. After performing an analysis, the regression statistics can be used to predict the dependent variable when the independent variable(s) are known. Regression goes beyond correlation by adding prediction capabilities. Linear regression is the most commonly used type and is used when the dependent variable is ratio/interval.
- In the Analyse Menu select Regression.
- Select Linear.
- Enter the variable that is being predicted as the Dependent at the top of the dialogue box.
- Enter the variable(s) being used to predict as the Independent(s).
- Click OK.
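To make the idea of prediction concrete, here is a minimal Python sketch of a regression with a single independent variable, fitted by least squares. It is an illustration only: the function name `fit_line` and the data are hypothetical, and SPSS reports many more statistics than the two coefficients shown here.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: predict tenure (years, the dependent) from age (the independent).
age = [25, 32, 41, 47, 55, 60]
tenure = [2, 5, 12, 15, 28, 30]

slope, intercept = fit_line(age, tenure)
predicted = intercept + slope * 50  # predicted tenure for a 50-year-old
print(round(slope, 3), round(intercept, 3), round(predicted, 1))
```

The slope and intercept correspond to the B coefficients for the independent variable and the constant in the SPSS Coefficients table.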
6. Manipulating Data
For analysis purposes it is often desirable to re-code a variable or to calculate a new variable using a combination of old variables.
6.1 Recode Command
A variable like Age with values (1 = 20-34, 2 = 35-44, 3 = 45-54, 4 = 55+) can be re-coded into a new variable with two values, say (1 = Up to 44, 2 = 45 and over), using the following procedure:
1. In the Transform Menu select Recode
2. Select Into Different Variables
3. Select the Input variable (the variable to be re-coded) from the Variable list and click the black arrow.
4. Enter the Name of the New (Output) Variable.
5. Enter a Label for the New Variable (Optional)
6. Click the Old and New Values button.
7. Enter the old value of the variable on the left side of the screen and the new value on the right and click the Add button.
For instance, in the example above a respondent who was in the range 35-44 will be coded as 2 in the old variable, but will be coded as 1 in the new variable.
8. Repeat this for all values of the old variable.
9. Click Continue
10. Click OK
The new variable should appear at the end of the variable list.
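The Old and New Values step amounts to a lookup from old codes to new codes. The Python sketch below is an illustration only, not SPSS syntax; the names `RECODE_MAP` and `recode` are hypothetical, and leaving unmapped codes unchanged is a choice made for this example.

```python
# Old code -> new code, following the Age example above:
# 1 (20-34) and 2 (35-44) become 1 (Up to 44); 3 (45-54) and 4 (55+) become 2.
RECODE_MAP = {1: 1, 2: 1, 3: 2, 4: 2}


def recode(values, mapping):
    """Return a new list; codes not in the mapping (e.g. missing) stay unchanged."""
    return [mapping.get(v, v) for v in values]


age_group = [1, 2, 3, 4, 2, 0]  # hypothetical cases; 0 marks a non-response
age_group2 = recode(age_group, RECODE_MAP)
print(age_group2)
```

Returning a new list rather than overwriting the old one mirrors the Into Different Variables option: the original variable is kept intact.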
6.2 Compute Command
The Compute command allows the user to form a new variable using a mathematical combination of several variables in the data set. In many surveys a researcher will ask several questions on a particular topic. To gauge the respondent's overall view on the topic the researcher can combine these questions and calculate the average score. To do this, or to calculate any other combination of variables, use the following procedure.
1. In the Transform Menu select Compute
2. Enter a name for the New Variable in the Target Variable box.
3. In the Numerical Expression Window enter the formula for calculating the new variable, using the variable list and the calculator (see the note below).
e.g. Variable1 + Variable2 + Variable3
Variable1 * Variable2 (* symbol represents multiplication)
Variable1 / Variable2 (/ symbol represents division)
4. Click OK
The new variable will appear at the end of the variable list.
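The average-score example mentioned in this section can be sketched in Python as follows. This is an illustration only, not SPSS syntax; the two cases and the new variable name Average are hypothetical.

```python
from statistics import mean

# Hypothetical cases with three questions on one topic, each coded 1-5.
cases = [
    {"Variable1": 2, "Variable2": 3, "Variable3": 4},
    {"Variable1": 5, "Variable2": 4, "Variable3": 5},
]

for case in cases:
    # Equivalent to (Variable1 + Variable2 + Variable3) / 3
    case["Average"] = mean([case["Variable1"], case["Variable2"], case["Variable3"]])

print([case["Average"] for case in cases])
```

As in SPSS, the new variable is calculated once per case and added as an extra column.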
Note: There are a large number of mathematical functions available for these calculations.
APPENDIX A
Q1. What is your gender?
Male 1 Female 2
Q2. What level of education have you achieved?
Post-Graduate 1   Degree 2   Secondary 3
Q3. What is your age? ____
Q4. Do you find time to relax?
Always 1   Usually 2   Sometimes 3   Rarely 4   Never 5
Q5. Do you ever feel stressed at work?
Always 1   Usually 2   Sometimes 3   Rarely 4   Never 5
Q6. Does your job involve the following tasks?
Evaluating Staff ___
Managing Staff ___
Training Staff ___
Q7. What is your Job Title? ___________________
Q8. What is your grade?
Junior Management 1   Middle Management 2   Senior Management 3
Q9. How many years are you in the organisation? ____