How to identify variables in your dataset

In order to enter data in SPSS you need to know what your variables are. This tutorial helps you identify the variables in your dataset.

Created by ASK (2012)

Table of Contents

How to Identify Variables in Your Dataset

1.1 What is a variable?

1.2 Identifying variables from an experiment

1.3 Identifying variables from a questionnaire

1.4 Arranging data in a spreadsheet

Additional Resources

How to Identify Variables in Your Dataset

1.1 What is a variable?

A variable is a measurement of:

A characteristic (e.g., Gender, Age, Height, Weight)

An activity or task (e.g., time to complete a task, 6-minute walk test)

Time points (e.g., pre-test, post-test, T0, T1, T2)

Experimental condition (e.g., Condition, Experimental group)

Opinion/belief (e.g., a survey question that asks for a respondent’s level of agreement with a statement)

Etc.

You will have multiple variables in your dataset. It is important to identify your variables in order to correctly arrange your data in Excel or SPSS.

1.2 Identifying variables from an experiment

Look at your dataset and ask the following questions:

1. Do I have any independent categorisations or groupings? (Independent here means that each subject can be categorised into only one of several groups.) Figure 1 shows the examples given below.

EXAMPLES:

I have male and female subjects.

The variable here is Gender. Under Gender, the data for each subject is recorded as either male or female, but cannot be both.

I have a control group and one or more experimental groups.

The variable here is Experimental condition. Under this variable, the data for each subject is recorded as the group they have been assigned to for the duration of the experiment. Each subject is assigned to exactly one condition.

I have categorised each subject as either underweight, normal weight, overweight or obese, based on their BMI.

The variable here is Weight group. Under this variable, the data for each subject is recorded as the weight group corresponding to their BMI. Each subject is categorised into exactly one group.

Figure 1. Independent categorisations or groupings.
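As a rough illustration of an independent grouping, the Python sketch below derives Weight group from a BMI value so that each subject lands in exactly one category. The cut-offs (18.5, 25 and 30) are the commonly used WHO adult thresholds, not values given in this tutorial, and the function name and sample subjects are hypothetical.

```python
def weight_group(bmi):
    """Return exactly one weight group for a BMI value.

    Assumption: cut-offs of 18.5, 25 and 30 (common WHO adult
    thresholds); the tutorial itself does not state any cut-offs.
    """
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal weight"
    if bmi < 30:
        return "overweight"
    return "obese"


# Hypothetical subjects: one row per subject, one column per variable.
subjects = [
    {"Subject": 1, "Gender": "female", "Experimental condition": "control", "BMI": 17.9},
    {"Subject": 2, "Gender": "male", "Experimental condition": "treatment", "BMI": 27.3},
    {"Subject": 3, "Gender": "female", "Experimental condition": "treatment", "BMI": 31.0},
]

for subject in subjects:
    # Each subject receives a single value under each grouping variable.
    subject["Weight group"] = weight_group(subject["BMI"])
    print(subject["Subject"], subject["Gender"], subject["Weight group"])
```

Because the function returns as soon as one condition matches, no subject can end up in two groups, which is what "independent" means here.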

2. Do I have any quantitative measurements taken of my subjects? (Quantitative here means that the measurements are numbers and not groups, categories, words or text.) Figure 2 shows the examples given below.

EXAMPLES:

I have the weight (in kg, stones, lbs, etc.) of each subject.

The variable here is Weight. Under this variable, the data for each subject is recorded as their measured weight. NOTE: this is different from Weight group in the example above, because here you have not grouped or categorised their weights; you are recording the actual weight for each subject.

I have the height (in cm, m, inches, feet, etc.) of each subject.

The variable here is Height. Under this variable, the data for each subject is recorded as their actual measured height. Always use only one unit of measure and use the same unit of measure for each subject. E.g., instead of 1 metre and 53 cm, write 1.53 metres.

I have the age (in days, months, years, etc.) for each subject.

The variable here is Age. Under this variable, the data for each subject is recorded as their actual age (not age group). Always use only one unit of measure and use the same unit of measure for each subject. E.g., instead of 23 years and 6 months, write all in years (23.5 years) or all in months (282 months).

I gave each subject a test and recorded their total score.

The variable here is Score. Under this variable, the data for each subject is recorded as their test score.

Each subject walked for 6 minutes and I recorded how far (in metres) they were able to walk.

The variable here is Distance. Under this variable, the data for each subject is recorded as the distance they walked in 6 minutes. Always use one unit of measure and use the same unit of measure for each subject.

I counted “how many _______” for each subject. For example, “how many cells die/live after treatment A”, “how many failures in 1 hour”, “how many hours worked per week”, etc.

The variable here is Frequency. Under this variable, the data for each subject is recorded as the “number of _______” (fill in the blank with your measure).

Figure 2. Quantitative measurements.
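The "one unit of measure" advice for Height and Age can be checked with a little arithmetic. The Python sketch below reproduces the conversions quoted in the examples (1 metre and 53 cm, 23 years and 6 months); the function names are hypothetical helpers, not part of SPSS or Excel.

```python
def height_in_metres(metres, centimetres):
    """Combine a mixed 'metres and cm' reading into metres only."""
    return metres + centimetres / 100


def age_in_years(years, months):
    """Express an age given as years and months in years only."""
    return years + months / 12


def age_in_months(years, months):
    """Express an age given as years and months in months only."""
    return years * 12 + months


print(round(height_in_metres(1, 53), 2))  # 1.53
print(age_in_years(23, 6))                # 23.5
print(age_in_months(23, 6))               # 282
```

Whichever unit you choose, apply the same conversion to every subject so that the whole column is comparable.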

3. Do I have any repeated measures data? That is, have I taken the same measurements from all subjects at several time points or under several conditions? In this case, each time point or each condition is its own variable. Figure 3 shows two of the examples given below.

EXAMPLES:

Each subject walked for 6 minutes and I recorded how far (in metres) they were able to walk, both pre-test and post-test (after 6 months of rehabilitation).

The variables here are Pre distance and Post distance. Record each of these as quantitative data as described in (2) above.

I gave each subject a test before they started the study. Each subject was then subjected to condition 1 and afterwards took the test again. Then, each subject was subjected to condition 2 and took the test a third time.

The variables here are Pre score, Cond1 score and Cond2 score. Record each of these variables as quantitative data as described in (2) above.

I have categorised each subject as either underweight, normal weight, overweight or obese, based on their BMI at baseline and then 6 months after starting an exercise regime.

The variables here are Baseline weight group and Post weight group. Record the data for each of these as independent groupings or categorisations as described in (1) above.

Figure 3. Repeated measurements of all subjects.
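To make the "each time point is its own variable" idea concrete, here is a minimal Python sketch of the Pre distance / Post distance example. The distances are invented purely for illustration.

```python
# Hypothetical data: distances (in metres) from a 6 minute walk test.
# Each time point is its own column; each subject is still one row.
rows = [
    {"Subject": 1, "Pre distance": 310, "Post distance": 365},
    {"Subject": 2, "Pre distance": 280, "Post distance": 300},
    {"Subject": 3, "Pre distance": 402, "Post distance": 398},
]

# Because both time points sit on the same row, the change for each
# subject is a simple difference between two columns.
changes = {row["Subject"]: row["Post distance"] - row["Pre distance"] for row in rows}
print(changes)  # {1: 55, 2: 20, 3: -4}
```

Keeping both measurements on the subject's own row is what lets each subject's pre-test value be paired with their post-test value.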

1.3 Identifying variables from a questionnaire

Look at your questionnaire and ask the following questions:

1. Do I have any single response questions? (Single response here means that participants select one response out of the options given.)

EXAMPLES:

Treat these as independent categorisations or groupings as described in (1) of section 1.2.

The variable here is Role. Under this variable, the data for each participant is recorded as staff, student or visitor.

Although this is presented as one question in matrix format, each of the items listed in column 1 is actually a separate question. All 4 of these questions use Likert scales and thus should be treated as independent categorisations or groupings as described in (1) of section 1.2.

The variables here are Cost, Distance to Uni, Distance to work and I feel safe. For each variable, the data for each participant is recorded as “very important”, “important”, “unimportant” or “very unimportant”.

Question 2. How important are the following when considering where to live?

Items (rows): Cost, Distance to Uni, Distance to work, I feel safe

Response options (columns): Very Important, Important, Unimportant, Very Unimportant

2. Do I have any multiple response questions? (Multiple response here means that participants select one or more responses out of the options given.)

EXAMPLES:

Because you can only enter one piece of information per participant under each variable in a spreadsheet, you cannot treat question 3 as one variable. Each response option is a variable. The data for each variable (except for Other) is then Yes (it was ticked) or No (it wasn’t ticked). For the variable Other, enter the responses given by participants. Treat each variable as independent categorisations or groupings as described in (1) of section 1.2 (see Figure 4 for an example).

The variables here are Hybrid/electric, Foot, Cycle, Public transport, Car/taxi and Other. Under each variable (except Other), the data for each participant is recorded as either Yes or No. If a participant ticked Other, simply enter the response they gave. You should not create a new variable for each distinct response given by participants under Other. If a participant gives more than one Other response, then you will need to create multiple Other variables and enter one response in each (e.g., Other1, Other2, etc.).

Figure 4. “Tick all that apply” question.
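The coding rule for a "tick all that apply" question can be sketched in Python as follows. This is only an illustration: the function name, the sample answers, and the choice to leave unused Other columns empty are assumptions rather than part of the tutorial.

```python
OPTIONS = ["Hybrid/electric", "Foot", "Cycle", "Public transport", "Car/taxi"]


def code_ticks(ticked, other_responses, n_other_columns):
    """Turn one participant's ticked boxes into one value per variable."""
    # Each response option becomes its own Yes/No variable.
    row = {option: ("Yes" if option in ticked else "No") for option in OPTIONS}
    for i in range(n_other_columns):
        # Assumption: leave the cell empty when the participant gave
        # fewer Other responses than there are Other columns.
        row[f"Other{i + 1}"] = other_responses[i] if i < len(other_responses) else ""
    return row


# Hypothetical participant who ticked Foot and Cycle and wrote one Other response.
row = code_ticks({"Foot", "Cycle"}, ["Skateboard"], n_other_columns=2)
print(row)
```

The number of Other columns is set by the participant who gave the most Other responses, so it is passed in rather than guessed per row.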

This is a ranked response question. Again, because only one piece of information per participant can be entered under each variable in a spreadsheet, question 4 cannot be treated as one variable. Instead, each of the items being ranked is a variable. The data for each variable is the rank given by the participant (i.e., 1 (Most important), 2, 3, 4 or 5 (Least important)). Treat each variable as independent categorisations or groupings as described in (1) of section 1.2 (see Figure 5 for an example).

The variables here are Never been, Weather, Surroundings, Cost and Accommodation. Under each variable, the data for each participant is recorded as Most important, 2nd, 3rd, 4th or Least important.

Figure 5. Ranked response question.
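A ranked response question can be sketched the same way: one column per item, holding the rank the participant gave it. The Python check below assumes that each rank from 1 to 5 may be used only once per participant, which the tutorial implies but does not state; the function name and the sample participant are hypothetical.

```python
ITEMS = ["Never been", "Weather", "Surroundings", "Cost", "Accommodation"]


def is_complete_ranking(row):
    """Check that the five items received the ranks 1 to 5, each used once."""
    return sorted(row[item] for item in ITEMS) == [1, 2, 3, 4, 5]


# Hypothetical participant: 1 = most important, 5 = least important.
participant = {"Never been": 3, "Weather": 1, "Surroundings": 4, "Cost": 2, "Accommodation": 5}
print(is_complete_ranking(participant))  # True
```

A check like this can flag rows where a rank was skipped or entered twice before the data goes into SPSS.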

3. Do I have any numeric open response questions? That is, questions in which participants write in a numeric response rather than tick a category. Treat each variable as a quantitative measurement as described in (2) of section 1.2.

EXAMPLES:

I asked participants to write their age in years.

The variable here is Age. Under this variable, the data for each participant is recorded as their age in years (see Figure 2).

I asked participants to write how many years they have worked at their current job.

The variable here is Years worked. Under this variable, the data for each participant is recorded as the number of years they have worked at their job.

I asked participants the age at which they plan to retire.

The variable here is Retirement Age. Under this variable, the data for each participant is recorded as the age (in years) at which they plan to retire.

1.4 Arranging data in a spreadsheet

As shown in the examples above, each variable is a column heading and each row represents a subject/participant. The data for each participant should go in that participant’s row, under the column for the corresponding variable. E.g., all data for subject/participant 1 should go in the row labelled “1”.

For more detail regarding how to arrange data in a spreadsheet so that you can do analysis in SPSS or Excel, please see How to arrange data in an Excel file on Blackboard.
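The layout described in this section (variables as column headings, one row per participant) can be sketched with Python's standard csv module. The variable names and values below are invented for illustration, and the in-memory buffer stands in for a real file.

```python
import csv
import io

# Column headings: one per variable (hypothetical example variables).
variables = ["Subject", "Gender", "Age", "Pre score", "Post score"]

# One row per subject/participant, values in the same order as the headings.
rows = [
    [1, "female", 23.5, 12, 15],
    [2, "male", 31.0, 9, 14],
]

# Every row must supply exactly one value per variable.
assert all(len(row) == len(variables) for row in rows)

buffer = io.StringIO()  # stands in for open("data.csv", "w", newline="")
writer = csv.writer(buffer)
writer.writerow(variables)
writer.writerows(rows)
print(buffer.getvalue())
```

A comma-separated file written this way keeps the one-row-per-participant structure, so it can be opened as a spreadsheet with the variables as column headings.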

Additional Resources

In the Getting Started folder under the SPSS resources section, you may be interested in the following:

1. How to code categorical variables (check this out if you have data from a questionnaire)

2. Levels of measurement (nominal, ordinal and scale variables)

3. How to enter your data into SPSS

4. How to create value labels for categorical variables

5. How to code, replace and define missing values in SPSS

6. How to arrange data in an Excel file (so it can be imported into SPSS)

* If you are unsure about which variables are categorical, have a look at the Levels of Measurement guide mentioned above.
