
Page 1: R Commander Tutorial - ecosensing.orgecosensing.org/wp-content/uploads/2012/04/R-commander-tutorial1.pdf · 1 R Commander Tutorial Introduction R is a powerful, freely available software


R Commander Tutorial

Introduction

R is a powerful, freely available software package for analyzing and graphing data. However, for somebody who does not use statistical software frequently, the big drawback of R is that it is command-line based and thus not very intuitive to use. For such users, R Commander might be a good alternative. R Commander is an R package that lets you run R from a graphical user interface, which makes analyzing and graphing your data in R a lot easier.

Objective

The objective of this tutorial is to give you a basic introduction to R Commander and how to use it to run

basic statistics and create graphs.

1. Start the R Commander

Open R by either clicking on the R icon on your desktop or by navigating to R in your programs folder. Once you have opened R, go to Packages/Load Packages … on the R menu bar and find Rcmdr in the R packages list (R packages are similar to software programs that have been written for R by different contributors). Highlight Rcmdr by clicking on it and click OK.

R might give you a warning message. If so, just ignore it and click No. The R Commander console should now appear on your screen, and you are ready to run some statistics and make some graphs in R.
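R Commander can also be started from the R Console instead of the menu. The lines below are a minimal sketch that assumes the Rcmdr package is already installed on your computer. The helper name rcmdr_available is made up for this illustration and is not part of R or Rcmdr.

```r
# Hypothetical helper: TRUE if the Rcmdr package is installed on this computer.
rcmdr_available <- function() {
  nzchar(system.file(package = "Rcmdr"))
}

# Only try to open the graphical interface in an interactive R session.
if (interactive()) {
  if (rcmdr_available()) {
    library(Rcmdr)  # loading the package opens the R Commander window
  } else {
    message("Rcmdr is not installed. Try install.packages(\"Rcmdr\") first.")
  }
}
```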


2. Reading your data into R

After you come back from the field, your notebook shows the following data recordings:

Now you want to create a digital copy of your data. To do this, start your computer, type the data table into Notepad or another text editor of your choice, and save the data table on your hard drive (Important: the data have to be separated by commas as shown below). Make sure you remember where you saved it so you can navigate to the dataset later on.


On the R Commander menu bar, go to Data/Import data and select from text file, clipboard, or URL …, which should bring up the window below. Make the same selections as shown in the window below (e.g. name your data set cover_moisture and select Commas as your field separator, since we separated our data by commas when we entered them into our text editor earlier).

Click OK and a window appears that allows you to navigate to your data file. Once you have navigated to your data file, highlight it by clicking on it and click Open. You can now view your data by clicking on View data set on the R Commander menu bar.
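The import step can be sketched in plain R with read.table(). The values below are placeholders invented for this illustration and are NOT the field data from the notebook. The column names location, cover, and soil.moisture follow the variable names used in this tutorial.

```r
# Placeholder values for illustration only; these are NOT the field data
# from the notebook. Replace them with your own comma-separated records.
csv_text <- "location,cover,soil.moisture
1,5,8
2,15,11
3,25,14
4,40,18
5,50,22
6,65,25
7,80,31
8,90,33"

# sep = "," matches the Commas field separator chosen in the import window.
cover_moisture <- read.table(text = csv_text, header = TRUE, sep = ",")

# To read from a saved file instead, pass its location on your hard drive, e.g.
# cover_moisture <- read.table("path/to/your_file.txt", header = TRUE, sep = ",")

str(cover_moisture)  # check the variable names and types
```

If the file used tabs instead of commas, sep = "\t" would be the matching setting.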

You can also enter your data directly into R by selecting Data from the R Commander menu bar and clicking on New dataset …. This will bring up the following window.

The Data Editor window that appears allows you to enter your data directly into R. By clicking on a column header, you can change the variable name of that column (e.g. change var1 to location, var2 to cover, and var3 to soil.moisture). The variable editor also allows you to select the type of the variables you are entering. Since you are entering numeric values, select numeric under variable type. Type in your data as shown below.

3. Summary statistics

To get some summary statistics of your data, go to Statistics/Summaries and select Numerical

summaries …. Now you should see the following window:

Pick cover and soil.moisture (Note: to select more than one variable you have to hold down the Ctrl key)

and click OK. A summary table will appear that shows the mean, standard deviation, and the 0, 0.25,

0.50, 0.75, 1 quantiles of the cover and soil.moisture data.
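The same summary can be approximated in plain R. This is a sketch using base R functions, not necessarily the exact code R Commander runs. The data values are placeholders invented for this illustration, and summarise_one is a hypothetical helper name.

```r
# Placeholder values for illustration only (not the notebook data).
cover_moisture <- data.frame(
  cover         = c(5, 15, 25, 40, 50, 65, 80, 90),
  soil.moisture = c(8, 11, 14, 18, 22, 25, 31, 33)
)

# Hypothetical helper: mean, standard deviation, and the five quantiles
# listed in the summary table for one numeric variable.
summarise_one <- function(x) {
  c(mean = mean(x), sd = sd(x),
    quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1)))
}

# Apply the helper to both variables; each column of the result is one variable.
stats <- sapply(cover_moisture[, c("cover", "soil.moisture")], summarise_one)

print(t(stats))  # transpose so that each variable is one row
```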

4. Scatterplot

To see if there is a relationship between cover and soil moisture, it might be a good idea to look at a scatterplot of the data. To create a scatterplot, go to Graphs on the R Commander menu bar and select Scatterplot …. This will bring up a window. Select cover as your x-variable and soil.moisture as your y-variable. Label your x- and y-axes Cover (%) and Soil Moisture (%), respectively. Next, click OK and a scatterplot will appear (Important: Make sure you highlight the R Console by clicking on it to be able to see the scatterplot).

You can save the scatterplot (or any other plot you create) by clicking on the plot (Important: if you do not select the plot you won't be able to save it), then going to File/Save as/Jpeg on the R menu bar (Note: the R menu bar and not the R Commander menu bar) and clicking on 100% quality … . This will bring up a window that allows you to specify the location on your computer where you want to save the plot as a Jpeg image.
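A comparable scatterplot can be drawn and saved from the R Console. The sketch below uses the base plot() function as a stand-in, so the figure may look different from the one R Commander produces. The data values are placeholders invented for this illustration, and the file name scatterplot.jpg is an assumption.

```r
# Placeholder values for illustration only (not the notebook data).
cover_moisture <- data.frame(
  cover         = c(5, 15, 25, 40, 50, 65, 80, 90),
  soil.moisture = c(8, 11, 14, 18, 22, 25, 31, 33)
)

# Draw the scatterplot with the same axis labels as in the tutorial.
plot(soil.moisture ~ cover, data = cover_moisture,
     xlab = "Cover (%)", ylab = "Soil Moisture (%)")

# Save the same plot as a Jpeg image at 100% quality.
# tempdir() is used here so the sketch does not overwrite any of your files.
out_file <- file.path(tempdir(), "scatterplot.jpg")
if (capabilities("jpeg")) {
  jpeg(out_file, quality = 100)
  plot(soil.moisture ~ cover, data = cover_moisture,
       xlab = "Cover (%)", ylab = "Soil Moisture (%)")
  dev.off()  # close the file so the image is written to disk
}
```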

5. Fitting a linear regression model

The scatterplot above shows us that there is a positive relationship between soil moisture and cover. However, the scatterplot does not tell us how strong the relationship is, whether it is significant, etc. To get this information we have to fit a linear regression model. To fit a linear regression model, go to Statistics/Fit models on the R Commander menu bar and select Linear model … . Select soil.moisture as your response variable (aka y-variable or dependent variable) and cover as your explanatory variable (aka x-variable or independent variable) and click OK.


The following output will appear in the Output Window of the R Commander:

We will discuss in class how to interpret the output table (e.g. what those numbers mean). To check the basic model diagnostics for the linear model you just fit, go to Models/Graphs on the R Commander menu bar and select Basic diagnostic plots. This brings up the following window (we will discuss in class how to interpret the model diagnostic plots):
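The model fit and the diagnostic plots can be sketched in plain R with lm() and plot(). The data values below are placeholders invented for this illustration, so the numbers in the output will not match the output table shown in this tutorial.

```r
# Placeholder values for illustration only (not the notebook data).
cover_moisture <- data.frame(
  cover         = c(5, 15, 25, 40, 50, 65, 80, 90),
  soil.moisture = c(8, 11, 14, 18, 22, 25, 31, 33)
)

# Response variable on the left of ~, explanatory variable on the right.
fit <- lm(soil.moisture ~ cover, data = cover_moisture)

# Coefficients, standard errors, p-values, and R-squared.
print(summary(fit))

# Four standard diagnostic plots arranged in a 2 x 2 grid.
old_par <- par(mfrow = c(2, 2))
plot(fit)
par(old_par)  # restore the previous plot layout
```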


6. Fitting multiple regression models

In this part of the tutorial you will learn how to fit a multiple regression model. Your hypothesis is that air temperature, solar radiation, and wind speed are significant predictors of ozone. To test this hypothesis, you collected the data called "airquality.txt" that are available in the class Dropbox folder (C:\...\Dropbox\Jan Teaching Files\CSS 560\Data\R Commander\airquality.txt) (Note: The data were taken from Dalgaard, 2002). Let's import the data into R Commander and call the dataset airquality (if you can't remember how to import data, please refer to section 2 of this tutorial). Let's take a look at the data to familiarize ourselves with them by selecting airquality from the Data set dropdown menu.

Next, let's plot the relationships between the different variables in the dataset. To do this, make the R Console active by clicking on it and type the following command at the R Console command line prompt: pairs(airquality).
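The pairs() command only works if the name inside the parentheses matches the name of the dataset exactly. As a sketch, the command can also be tried on the dataset named airquality that ships with R. It is an assumption here that this built-in dataset has the same variables as the class file. The name aq is made up for this illustration.

```r
# Assumption: the built-in datasets::airquality stands in for the class file.
# Keep only the four variables discussed in this section.
aq <- datasets::airquality[, c("Ozone", "Solar.R", "Wind", "Temp")]

# One scatterplot for every pair of variables.
pairs(aq)
```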


Now you should see the following figure:

This is how you read the figure:

It looks like there is some sort of relationship between ozone and temperature and ozone and wind.

However, there seems to be no relationship between ozone and solar radiation.

OK - let's now fit a multiple regression model to test if solar radiation, wind, and temperature are

significant predictors of ozone. To fit a multiple regression model let's go to Statistics/Fit models... on


the R Commander menu bar and select Linear model... . A window appears that should be somewhat

familiar to you from section 5 of this tutorial. The model you want to fit basically says that ozone is a

function of solar radiation, air temperature, and wind. Mathematically, we can write this model as

follows:

Ozone ~ Solar.R + Temp + Wind [1]

After typing model [1] in the appropriate section of the linear model window (see above) click OK. You

should now see the following output:


Let's also take a look at the model diagnostics:

We will discuss the interpretation of the model output as well as the interpretation of the model diagnostics in more detail in class.
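Model [1] and its diagnostics can be sketched in plain R as follows. It is an assumption here that the airquality dataset that ships with R has the same variables as the class file, so the output may differ from the one shown in this tutorial. The name aq is made up for this illustration.

```r
# Assumption: the built-in datasets::airquality stands in for the class file.
aq <- datasets::airquality

# Model [1]: ozone as a function of solar radiation, air temperature, and wind.
fit <- lm(Ozone ~ Solar.R + Temp + Wind, data = aq)

# Rows with missing values are dropped automatically by lm().
print(summary(fit))

# Four standard diagnostic plots arranged in a 2 x 2 grid.
old_par <- par(mfrow = c(2, 2))
plot(fit)
par(old_par)  # restore the previous plot layout
```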

7. Paired t-test

Next, we will conduct a paired t-test to see if there is a statistically significant difference in soil moisture before and after a rain event. The data for the paired t-test are in the class Dropbox folder (C:\Users\Jan\Dropbox\Jan Teaching Files\CSS 560\Data\R Commander\paired_t_test.txt). Import the data into R by following the steps you learned at the beginning of this tutorial and name the dataset soil_moisture (Hint: Open the paired_t_test.txt file in a text editor. You will see that the paired_t_test.txt file is a tab-delimited file and not a comma-delimited file. You need that information to properly import the data into R).

Before conducting a paired t-test (or any other t-test), it might be a good idea to look at a boxplot of the data first. To do this, you have to stack your data first (you are just re-arranging the data so they are in a format that the computer can use to create a boxplot) by going to Data/Active data set on the R Commander menu bar and clicking on Stack variables in active data set … .


You should now see the Stack Variables window shown below. Select both the soil.moisture.after and

soil.moisture.before variables and name the stacked dataset stacked_soil_moisture. Keep the rest of the

default settings as shown below and click OK.

Next, go to Graphs/Boxplots… on the R Commander menu bar. In the window that pops up select Plot by

groups… and group your variables by factor and click OK. Now you should see the following boxplot:

Based on the boxplot, do you think the soil moisture changed significantly after the rain event?
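Stacking and plotting can be sketched in plain R with stack() and boxplot(). The values below are placeholders invented for this illustration and are NOT the data in the class file. The column names variable and factor are assumed to match the default names in the Stack Variables window.

```r
# Placeholder values for illustration only (not the class data).
soil_moisture <- data.frame(
  soil.moisture.before = c(10, 12, 11, 14, 13),
  soil.moisture.after  = c(15, 16, 14, 19, 17)
)

# stack() puts all measurements into one column and records in a second
# column which original variable each measurement came from.
stacked_soil_moisture <- stack(soil_moisture)
names(stacked_soil_moisture) <- c("variable", "factor")

# One box per group (before and after the rain event).
boxplot(variable ~ factor, data = stacked_soil_moisture,
        ylab = "Soil moisture", xlab = "")
```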

After visually looking at the data we are ready to run a paired t-test. To do this, let’s go back to our

original, unstacked dataset by going to Data set on the R Commander menu bar and selecting

soil_moisture. Click OK.


Next, go to Statistics/Means on the R Commander menu bar and select Paired t-test … .

Next, select soil.moisture.before as your first variable and soil.moisture.after as your second variable. Keep the rest at the default settings as shown below.

After clicking OK you should get the following output. We will discuss in class how to interpret the

output.
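In plain R, a paired t-test is run with t.test() and paired = TRUE. The sketch below uses placeholder values invented for this illustration, so the output will not match the output shown in this tutorial.

```r
# Placeholder values for illustration only (not the class data).
# Each row is one location measured before and after the rain event.
soil_moisture <- data.frame(
  soil.moisture.before = c(10, 12, 11, 14, 13),
  soil.moisture.after  = c(15, 16, 14, 19, 17)
)

# First variable minus second variable, tested pair by pair.
result <- t.test(soil_moisture$soil.moisture.before,
                 soil_moisture$soil.moisture.after,
                 paired = TRUE)

print(result)
```

The unstacked dataset is used here because a paired test needs the before and after values of the same location side by side.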

8. Two-sample t-test

In this section of the tutorial we will learn how to conduct a two-sample t-test. We want to test the following hypothesis: the soil pH of the untreated stand in the Ponderosa State Park is significantly different from the soil pH in the treated part of the park. The hypothetical data that were collected are available in the class Dropbox folder (C:\...\Dropbox\Jan Teaching Files\CSS 560\Data\R Commander\ph.txt).

Let's import the data into R Commander and create a boxplot of the data as we learned in section 7 of this tutorial (remember: you first have to stack the data in order to create the boxplot below. For more details please refer to section 7 of this tutorial).

OK - it looks like the soil pH in the untreated part of the forest is lower than in the treated part. Let's now do a two-sample t-test to see if the soil pH values of the two parts are significantly different from each other. To do this, keep your stacked pH dataset active, go to Statistics/Means on the R Commander menu bar, and select Independent samples t-test... (if the Independent samples t-test... option is greyed out, make sure that i) you stacked the pH dataset and ii) the stacked pH dataset is the active dataset).


The window that now appears should look similar to the one below:

Keep the default settings and click OK. Now you should see the following output:

We will discuss in class how to interpret the output.
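In plain R, a two-sample t-test on stacked data is run with t.test() and a formula. The sketch below uses placeholder values invented for this illustration and NOT the data in ph.txt. The group labels treated and untreated are assumptions, and var.equal = FALSE assumes that the default setting in the window does not assume equal variances.

```r
# Placeholder values for illustration only (not the class data).
# The stacked layout has one column of measurements and one column of groups.
pH_stacked <- data.frame(
  variable = c(6.1, 6.3, 6.0, 6.4, 6.2, 5.6, 5.8, 5.5, 5.9, 5.7),
  factor   = factor(rep(c("treated", "untreated"), each = 5))
)

# Measurement on the left of ~, grouping variable on the right.
# var.equal = FALSE requests the Welch version of the test.
result <- t.test(variable ~ factor, data = pH_stacked, var.equal = FALSE)

print(result)
```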

9. Customize your graphs

If you want to customize your figures, you have to do a little bit of programming. For example, the boxplot you created in section 8 of this tutorial is associated with the following line of code in your R Commander script window:

boxplot(variable ~ factor, ylab = "pH", xlab="factor", data = pH_stacked)


We can now change this line of code a little to make the boxplot nicer. For example, we could type the following into the R Console:

boxplot(variable ~ factor, ylab = "Soil pH", xlab = "", names = c("Treated Forest", "Untreated Forest"), data = pH_stacked)

If you type the code above into the R Console and hit Enter, you should see the following boxplot:

It becomes clear that you need some R programming experience and knowledge to change the appearance of the figure beyond what R Commander allows you to do. If you want to learn more about how to program in R, the R website (http://www.r-project.org/) is a good starting point, as is Peter Dalgaard's book "Introductory Statistics in R".

10. Closing R Commander and R

To close the R Commander and R, go to File/Exit and select From Commander and R.


Next, the R Commander will ask you if you want to exit the program. Click OK. Next it will ask you if you

want to save the script file and the output file. Click No in both cases.

Congratulations - you successfully finished the R Commander tutorial.

Other resources

“Getting started with the R Commander”. You can find a pdf of this tutorial on our class website (http://ecosensing.org/teaching/css-560/digital-library/tutorials). If you want to learn more about R Commander, I recommend working through this tutorial.

Literature cited

Dalgaard, Peter. 2002. Introductory Statistics in R. Springer Science and Business Media, Inc.

Disclaimer

Always consult a trained statistician to validate the correctness of the statistical approach you are taking. Please e-mail any suggestions on how to improve this document to Jan Eitel (jeitel@uidaho.edu). Use of trade names does not constitute an official endorsement by the McCall Outdoor Science School.

Important: If you used a MOSS computer for this tutorial, please make sure you delete all the files you

created from the computer after you are done with the tutorial. Thanks!