Getting started with R

Post on 19-Jun-2015


Presentation by Jacob van Etten. CCAFS workshop titled "Using Climate Scenarios and Analogues for Designing Adaptation Strategies in Agriculture," 19-23 September in Kathmandu, Nepal.


Jacob van Etten

What is R?

“R is a free software environment for statistical computing and graphics.”

www.r-project.org

Why R?

R takes time to learn and use.

So why should I bother?

There are more user-friendly programmes, right?

12 reasons to learn R

1. Rigour and strategy in data analysis – not “thinking after clicking”.

2. Automating repeated calculations saves time in the long run.

3. Many tasks are simply not feasible in menu-driven software.

4. R is not only software; it is also an online community of user support and collaboration.

5. Scripts make it easy to communicate about your problem. Important for collaborative research!

6. Research becomes replicable when scripts are published.

7. R packages represent the state of the art in many academic fields.

8. Graphics in R are very good.

9. R stimulates learning – graduation from user to developer.

10. R is free, which saves you money. Or R may be the only option when budgets are restricted.

11. R encourages you to explore new methods freely and learn about them.

12. Knowing how to work with R is a valuable and transferable skill.

It’s the long way but it’s worth it...

Resources to learn R

My two picks for you

http://pj.freefaculty.org/R/Rtips.html

Some R packages of interest

RNCEP – Reanalysis data

clim.pact – Downscaling climate data

GhcnDaily – Daily weather data

weatherData (R-Forge) – Daily weather data and derived bioclimatic variables relevant to plant growth

raster – gives access to WorldClim data

And a lot more here...

http://cran.r-project.org/other-docs.html

Downloading R

Choose a mirror nearby and then...

Binaries

“When downloading, a completely functional program without any installer is also often called program binary, or binaries (as opposed to the source code).” (Wikipedia)

And finally, you can download R...

When downloading has finished, run the installer.

The bare R interface

RStudio makes life easier

Rstudio.org

Four parts

Scripts, documentation

Console

Files, plots, packages and help

Workspace/history

Create a new R script: File – New – R Script

Our first code...

Type

1 + 1

into the script area.

Then click “Run”.

What happens?

Exercises: running code

Type a second line with another calculation (use “-”, “/”, or “*”) and click “Run” again.

Select only one line with the mouse or Shift + arrows. Then click “Run”.

Save your first code to a separate folder “Rexercises”.
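As a rough sketch of what the extra lines in this exercise might look like (the numbers are arbitrary and chosen only for illustration), each line is one calculation, and R prints its result in the console when the line is run:

```r
# Each line is one expression; running it prints the result in the console.
7 - 2    # subtraction
8 / 4    # division
3 * 5    # multiplication
```

Selecting a single line before clicking “Run” sends only that line to the console.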

Following exercises

In the next exercises, we will develop a script.

Just copy every new line and add it to your script, without erasing the previous part.

If you want to make a comment in your script, put a # before that line. Like this:

#important to remember: use # to comment

If the exercises are a bit silly...

...that’s because you are learning.

Vector

Type a new line with the expression

1:10

in the script and run this line.

A concatenation of values is called a vector.

Making a new variable

If we send 1:10 into the console it will only print the outcome. To “store” this vector, we need to do the following.

a <- 1:10

Here “a” is the new variable, “<-” means assign, and 1:10 gives the vector values 1 to 10.
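A minimal sketch of how assignment behaves, using the same variable name as the slide. The second assignment (5:7) is an illustrative value that is not part of the original exercise:

```r
a <- 1:10     # store the values 1 to 10 under the name "a"; nothing is printed
a             # typing the name prints the stored vector
length(a)     # number of stored values: 10

a <- 5:7      # assigning again replaces the old contents of "a"
length(a)     # now 3
```

If you run this in your own script, assign 1:10 to a again afterwards so the following exercises behave as described.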

Operations with vectors

Try the following and see what happens.

a
a * 2
a * a
b <- a * a
b
print(b)
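The pattern behind these operations is that arithmetic on vectors works element by element. The sketch below illustrates this; the variable names doubled, squared and total are chosen here for illustration only:

```r
a <- 1:10

doubled <- a * 2      # every element is multiplied by 2
squared <- a * a      # elements are multiplied pairwise: 1*1, 2*2, ..., 10*10
total   <- sum(squared)   # functions such as sum() reduce a vector to one value

doubled
squared
total
```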

Other ways of making vectors

d <- c(1, 6, 9)
d
class(d)
f <- LETTERS
f
class(f)

What is the difference between d and f?
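To explore the idea of classes a little further, here is a small sketch with made-up vectors (numbers, words and mixed are illustrative names, not part of the exercise). It also shows what might happen when numbers and text are combined in one vector:

```r
numbers <- c(2.5, 4, 10)
words   <- c("rice", "maize")

class(numbers)   # "numeric"
class(words)     # "character"

# A vector holds values of one class only, so mixing numbers and text
# converts everything to text.
mixed <- c(1, "a")
class(mixed)     # "character"
```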

Functions

Actually, we have already seen functions!

Functions consist of a name followed by one or more arguments. Commas and brackets complete the expression.

class(f)
c(d,f)

In class(f), “class” is the name and “f” is the argument.
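Many functions take several arguments, and arguments can be given by name. The following sketch uses three base R functions that the slides do not mention (round, seq and mean), purely as illustrations of the name-plus-arguments pattern:

```r
# name(argument, argument, ...)
rounded <- round(3.14159, digits = 2)        # two arguments, the second given by name
steps   <- seq(from = 0, to = 1, by = 0.25)  # three named arguments
average <- mean(c(1, 6, 9))                  # one function call used as the argument of another

rounded
steps
average
```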

Cheat sheet

When you use R, you will become familiar with the most common functions.

If you need a less common function, there are ways to discover the right one.

For now, use the cheat sheet to look up the functions you need.

Getting help on functions

These commands open the help pages for the functions c and class (in your browser or in the RStudio help pane):

?c
?class

The examples are often especially helpful. Just copy and paste the code into the console and see with your own eyes what happens!
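Besides the question mark, base R offers a few other ways to look at a function's documentation. This sketch shows two that may be useful here; read.csv is used only as an example of a function with many arguments:

```r
# help("c") is the long form of ?c and opens the same page.

example(c)       # runs the code from the Examples section of the help page for c()
args(read.csv)   # prints the argument names and default values of read.csv()
```

Using example() saves the copying and pasting, at the cost of running all examples at once.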

Matrices

We have already met the vector.

If we put two or more vectors together as columns, we get a matrix.

X <- c(1,2,3)
Y <- c(8,9,7)
Z <- c(4,2,8)
M <- cbind(X, Y, Z)

How many columns and rows does M have?
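One possible way to inspect a matrix, sketched with the same vectors as above. The functions dim, nrow, ncol and rbind are base R but are not mentioned on the slide, and the name R_rows is made up for this example:

```r
X <- c(1, 2, 3)
Y <- c(8, 9, 7)
Z <- c(4, 2, 8)
M <- cbind(X, Y, Z)    # vectors become columns

dim(M)                 # number of rows, then number of columns
nrow(M)
ncol(M)
M[2, "Y"]              # the value in row 2 of column Y

R_rows <- rbind(X, Y, Z)   # the same vectors stacked as rows instead
R_rows
```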

Data frames

Matrices must consist of values of the same class. But often datasets consist of a mix of different types of variables (real numbers and groups). This is the job of data frames.

L <- c("a", "b", "c")
Df <- data.frame(X,Y,Z,L)

Inspect the structure of Df like this:

str(Df)

What would happen if you tried to make a matrix out of these same vectors instead? Try and see.
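The difference between the two structures can be sketched as follows, using only two of the vectors to keep it short. The name Mixed is made up for this example:

```r
X <- c(1, 2, 3)
L <- c("a", "b", "c")

# A data frame keeps a separate class for every column.
Df <- data.frame(X, L)
class(Df$X)           # "numeric"

# A matrix can hold one class only, so the numbers are converted to text.
Mixed <- cbind(X, L)
class(Mixed[, "X"])   # "character"
```

This is why calculations on the X column work in the data frame but not in the matrix.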

Getting data into R

?read.csv

CSV files are a relatively trouble-free way of getting data into R.

It is a fairly common format.

You can make a CSV file in any spreadsheet software.

Create a CSV file

First name   Family name   Sex      Age
John         Travolta      Male     57
Elijah       Wood          Male     30
Nicole       Kidman        Female   44
Keira        Knightley     Female   26

Add your own favorite actor, too.

Open the file with Notepad.

Make sure the values are separated by commas.

Now use R to read it

Now read it into R.

actors <- read.csv("yourfile.csv")
str(actors)
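If you want to try this without a spreadsheet, the following sketch writes the same table to a temporary CSV file and reads it back. The temporary file path (csv_path) is an assumption made for this example; in the exercise you would use the path of the file you saved yourself:

```r
# Write the example table to a temporary file (hypothetical location).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "First name,Family name,Sex,Age",
  "John,Travolta,Male,57",
  "Elijah,Wood,Male,30",
  "Nicole,Kidman,Female,44",
  "Keira,Knightley,Female,26"
), csv_path)

# Read the file; the first line is used for the column names.
actors <- read.csv(csv_path)
str(actors)
names(actors)   # spaces in the header are replaced by dots, e.g. First.name
```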

Subsetting

There are many ways of selecting only part of a data frame. Observe carefully what happens.

actors[1:2,]
actors[,1:2]
actors["Age"]
actors[c("First.name", "Age")]
subset(actors, Age > 40)

(read.csv turns the header “First name” into the column name First.name.)

Now create a new data frame with the actors younger than 45.
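Two further ways of subsetting, sketched on a small data frame built directly in the script so that the example does not depend on your CSV file. The column names follow what read.csv would produce from the table above, and the names older, ages and women are made up for this example:

```r
actors <- data.frame(
  First.name  = c("John", "Elijah", "Nicole", "Keira"),
  Family.name = c("Travolta", "Wood", "Kidman", "Knightley"),
  Sex         = c("Male", "Male", "Female", "Female"),
  Age         = c(57, 30, 44, 26)
)

# A logical condition inside the square brackets selects rows.
older <- actors[actors$Age > 40, ]    # same rows as subset(actors, Age > 40)

# The $ sign extracts a single column as a vector.
ages <- actors$Age

# Rows and columns can be selected at the same time.
women <- actors[actors$Sex == "Female", c("First.name", "Age")]

older
ages
women
```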

Graphics

The plot function makes graphs.

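actors$Sex <- factor(actors$Sex)  # needed in R 4.0 and later, where text columns are no longer read as factors
plot(actors[c("Sex", "Age")])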
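As a sketch of how a plot might be customised, the example below builds a small data frame in the script (so it runs without the CSV file) and draws a labelled box plot and a bar plot. The functions boxplot and barplot, and the label texts, are choices made for this illustration and are not part of the original slide:

```r
actors <- data.frame(
  First.name = c("John", "Elijah", "Nicole", "Keira"),
  Sex        = factor(c("Male", "Male", "Female", "Female")),
  Age        = c(57, 30, 44, 26)
)

# Box plot of age for each group, with axis labels and a title.
boxplot(Age ~ Sex, data = actors,
        xlab = "Sex", ylab = "Age (years)", main = "Age of actors by sex")

# Bar plot with one bar per actor.
barplot(actors$Age, names.arg = actors$First.name, ylab = "Age (years)")
```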

Summary

You now know about:

Variables
Functions
Vectors
Matrices
Data frames
Getting tabular data into R
Subsetting
Simple plotting

Time for your first fight...