R is programming - Evolutionary Biologyevol.bio.lmu.de/_statgen/Rcourse/ws1920/slides/Lecture... ·...

5
x ← 0 For i varying from 1 to 5 then x ← x + i 2 End For What it the value of x at the end of the loop? Answer: 55 R is programming... … you can only be a good programmer if you practice a lot! An introduction to WS 2019/20 Dr. Noémie Becker Dr. Eliza Argyridou Special thanks to: Prof. Dr. Martin Hutzenthaler, Dr. Sonja Grath and Dr. Benedikt Holtmann for significant contributions to course development, lecture notes and exercises Getting Started with R 4 What you should know after day 2 Part I: Getting started What is R How R is organized Installation of R and Rstudio Organize your R session Part II: Basics R as calculator What is a function? What is an assignment? How to plot a continuous function and how to make a scatterplot Getting help 5 What is R? R is a comprehensive statistical environment and programming language for professional data analysis and graphical display. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories. Webpage: http://www.r-project.org Advantages: R is free New statistical methods are usually first implemented in R Lots of help due to active community Disadvantages: R has a long learning phase No 'undo' Work with scripts 6 R commands are organized in packages (also called libraries) Examples: base, stats, datasets, ggplot2, dplyr To use a package, it has to be installed AND loaded! Which packages are loaded at start? library(lib.loc=.Library) Which packages are installed? installed.packages() Load package: library(packagename) How to get help: library(help="package") ??package How R is organized Try yourself: library(help="ggplot2") ??ggplot2

Transcript of R is programming - Evolutionary Biologyevol.bio.lmu.de/_statgen/Rcourse/ws1920/slides/Lecture... ·...

Page 1: R is programming - Evolutionary Biologyevol.bio.lmu.de/_statgen/Rcourse/ws1920/slides/Lecture... · 2020. 3. 3. · 3. Open a R session and try the commands we learned today (lecture

x ← 0For i varying from 1 to 5 thenx ← x + i2

End For

What it the value of x at the end of the loop?

Answer: 55

R is programming...

… you can only be a good programmer if you practice a lot!

An introduction toWS 2019/20

Dr. Noémie BeckerDr. Eliza Argyridou

Special thanks to: Prof. Dr. Martin Hutzenthaler, Dr. Sonja Grath and Dr. Benedikt Holtmann for significant

contributions to course development, lecture notes and exercises

Getting Started with R

4

What you should know after day 2

Part I: Getting started

● What is R● How R is organized● Installation of R and Rstudio● Organize your R session

Part II: Basics

● R as calculator● What is a function?● What is an assignment?● How to plot a continuous function and how to make a scatterplot● Getting help

5

What is R?● R is a comprehensive statistical environment and programming

language for professional data analysis and graphical display.

● It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories.

● Webpage: http://www.r-project.org

Advantages:● R is free● New statistical methods are

usually first implemented in R● Lots of help due to active

communityDisadvantages:● R has a long learning phase● No 'undo'➔ Work with scripts

6

R commands are organized in packages (also called libraries)

Examples: base, stats, datasets, ggplot2, dplyr

To use a package, it has to be installed AND loaded!

Which packages are loaded at start?library(lib.loc=.Library)Which packages are installed?installed.packages()

Load package:library(packagename)

How to get help:library(help="package")??package

How R is organized

Try yourself:library(help="ggplot2")??ggplot2

Page 2: R is programming - Evolutionary Biologyevol.bio.lmu.de/_statgen/Rcourse/ws1920/slides/Lecture... · 2020. 3. 3. · 3. Open a R session and try the commands we learned today (lecture

7

Pre-Defined Datasets

R comes with a huge amount of pre-defined datasets, available in the package ‘datasets’ (usually available at start)

Examples: 'cars', 'mtcars', 'chickwts', ...→ can be used for exercises, demonstration of in-built functions...

How to use a dataset:data(cars)

How to get help on a dataset:?cars

➔ We will use pre-defined datasets in some of the exercises

8

What is R Studio?

● Powerful Integrated Development Environment (IDE) for R

● It is free and open source, and works on Windows, Mac, Linux and over the web

● Webpage: https://www.rstudio.com/

● We will work with RStudio throughout the course, but everything we do would also work in the R console

RProgramming.net

Some benefits:Syntax highlighting, code completion, and smart indentationExecute R code directly from the source editorEasily manage multiple working directories using projectsPlot history, zooming, and flexible image and PDF exportIntegrated R help and documentationSearchable command history

9

boxplot(wing~sex,data=sparrows,xlab='Sex',ylab='Wing length (mm)',col=c("red","blue"),ylim=c(50,70),main="Boxplot of wing lengths")

boxplot(wing ~ sex, data = sparrows, xlab = 'Sex',

ylab = 'Wing length (mm)', col = c("red", "blue"), ylim = c(50, 70),

main = "Boxplot of wing lengths")

RStudio helps with code indentation

10

Organize your R session

General advice:

● Organize your work in folders (e.g., “Rcourse/Day2”)

● Save your commands in scripts

What is a script?

● A computer-readable text file (do not use MS Word or similar)

● For R, the conventional extension is .R

Example:

example.R (see webpage)

11

Questions:1) What does “#” stand for?

2) Who wrote the script and when?

3) Where does the data come from?

4) Which command is used to compute the clutch size mean for the European green woodpecker?

5) How do we get information on the current session?

YOUR TURN

12

How to organize your work and a R sessionA recipe in 15 steps

1) On your computer, create a folder for your project and keep all your material there (e.g. Rcourse/day2)

2) Start Rstudio or a R console

3) Open a new script file or a pre-existing script

4) Save the file to the folder you created in step 1 (file extension .R), save it regularly (!) ('Ctrl+S')

5) Write all your instructions for R into this script file, execute and test your instructions (with button or 'Ctrl+Enter' in RStudio)

6) Always put comments (#) in your script

7) Clean R at the beginning of your work

8) Set your working directory with setwd("path2directory")or over menu in RStudio (Session -> Set working directory)

9) Check your working directory with getwd()10) Load (and install) required packages

● Install with install.packages("name") - only once, need to specify CRAN mirror● Load with library(name) – each session if required

11) Import your data and perform your analyses

12) Output is saved in your working directory (if folder unspecified)

13) Consider sessionInfo() and Sys.time()14) Save your final script

15) Quit your session (q() in console) and save workspace if required

Page 3: R is programming - Evolutionary Biologyevol.bio.lmu.de/_statgen/Rcourse/ws1920/slides/Lecture... · 2020. 3. 3. · 3. Open a R session and try the commands we learned today (lecture

13

What you should know after day 2

Part I: Getting started

● What is R● How R is organized● Installation of R and Rstudio● Organize your R session

Part II: Basics

● R as calculator● What is a function?● What is an assignment?● How to plot a continuous function and how to make a scatterplot● Getting help

14

R as calculator

Basic arithmetic operations

2+3

7-4

3*5

7/3; 2^6

Integer vs. modulo division

5 %/% 3 # 5 divided by 3 without decimal positions

6 %% 2 # if you divide 6 by 2 – what's the rest?

Caution: German (Spanish, French..) decimal notation does not work!> 1,2Error: unexpected ',' in “1,”

> 1.2

15

Functions/Commands

General form:

function()

Examples:

sqrt()exp()sum()prod()...

Functions can have pre-defined parameters/arguments with default settings

→ help page of the function

?read.table()

16

Examples for functions in R

Important functions

exp()sin()cos()max()min()sum()prod()sqrt()factorial()choose()

17

Examples for functions in R

Important functions

exp()sin()cos()max()min()sum()prod()sqrt()factorial()choose()

factorial()“4 factorial”, 4!→ 4*3*2*1

choose()“5 choose 2”, (ab)

exp(1); exp(log(5))sin(pi/2)cos(pi/2)max(4,2,5,1); min(4,2,5,1) sum(4,2,5,1); prod(4,2,5,1)sqrt(16)factorial(4)choose(5,2)

YOUR TURN

18

How to plot a continuous function

You need

● Plotting function● The continuous function you want to plot● Range [a, b]

If fun is a function, then plot(fun, from = a, to = b) plots fun in the range [a, b]

Examples:plot(sin, from = -2*pi, to = 2*pi)plot(dnorm, from = -3, to = 3)

As plotting function you can use

plot()

Page 4: R is programming - Evolutionary Biologyevol.bio.lmu.de/_statgen/Rcourse/ws1920/slides/Lecture... · 2020. 3. 3. · 3. Open a R session and try the commands we learned today (lecture

19

20

AssignmentsGeneral form:variable <- value

Example:x <- 5“The variable 'x' is assigned the value '5'”

Valid variable names: may contain numbers, '_', charactersAllowed:my.variable, my_variable, myVariable, favourite_color, a, b, c, data2, 2test …NOT allowed:

– '.' followed by number at the beginning: .4you

– reserved words, e.g.: if, else, repeat, while, function, for, FALSE, TRUE, …

R is case-sensitive: data, DATA, Data – all different variable names

In general: Use IDE (RStudio) and consider following a style guide for your code

Examples for style guides: http://adv-r.had.co.nz/Style.html Hadley Wickhamhttps://google.github.io/styleguide/Rguide.xml Google Style Guide for R

21

22

Assignments

You can write an assignment in three different ways:x <- 5 # preferred version in scripts5 -> xx = 5

But – have a look here:plot(dnorm, from = -3, to = 3)

Works with longer expressions:x <- 2y <- x^2 + 3z <- x + y

And with complete vectors (more on vectors tomorrow):x <-1:10> x [1] 1 2 3 4 5 6 7 8 9 10y <-x^2 > y [1] 1 4 9 16 25 36 49 64 81 100

23

How to plot numerical vectors

You need

● Plotting function● Two vectors

If x and y are numerical vectors, then plot(x, y) produces a scatterplot of y against x

Examples:x <- 1:10y <- x^2plot(x, y)

As plotting function you can use

plot()

24

Help!

R console

help(solve) # help page for command "solve"

?(solve) # same as help(solve)

help("exp")

help.start()

help.search("solve") # list of commands which could # be related to string "solve"

??solve # same as help.search("solve")

example(exp) # examples for the usage of 'exp'

example("*") # special characters have to be in# quotation marks

Page 5: R is programming - Evolutionary Biologyevol.bio.lmu.de/_statgen/Rcourse/ws1920/slides/Lecture... · 2020. 3. 3. · 3. Open a R session and try the commands we learned today (lecture

25

Take-home message

● It is important to organize your session ● RStudio● Save your commands in a script (.R)● You can use R as a calculator● Basic functions are present in R● Assign values to variables● Look for help

26

What you should do now

If you have your own laptop or computer1. Install R and RStudio (see web tutorials)2. Read the first chapter of “R in Action” (course web page)http://www.manning.com/kabacoff/SampleCh-01.pdf 3. Open a R session and try the commands we learned today (lecture slides)→ if you have trouble with installing R, ask us/the tutors4. Work on the first exercise sheet 5. Go over the solutions of exercise sheet 1

If you don't have your own laptop or computer1. Read the first chapter of “R in Action” (course web page)http://www.manning.com/kabacoff/SampleCh-01.pdf 2. This afternoon in the exercise session: open a R session and try the commands we learned 3. Work on the first exercise sheet 4. Go over the solutions of exercise sheet 1

27

Literature

R in Action Data Analysis and Graphics with R2nd edition (2011)Robert I. Kabacoffhttps://www.manning.com/books/r-in-action-second-edition

Homework: Read Chapter 1 (freely available online as PDF)

Webpagehttp://www.statmethods.net/

… and many, many more (also free web tutorials)

Getting started with RAn Introduction for Biologists(2017)Andrew P. Beckerman, Dylan Z. Childs & Owen L. Petchey