Introduction to R Lecture 1: Getting Started


Page 1: Introduction to R Lecture 1: Getting Started

Introduction to R
Lecture 1: Getting Started

Andrew Jaffe

8/30/10

Page 2: Introduction to R Lecture 1: Getting Started

Lecture 1

• Course overview

• What is R?

• Installing R

• Installing a text editor

• Interfacing text editor with R

• Writing scripts

• Using R as a calculator

Page 3: Introduction to R Lecture 1: Getting Started

About the Course

• Series of 7 seminars

• Covers the usage of R
  – Platform for beginning analyses
  – NOT covering statistics
  – Good programming etiquette

• Bring your laptop – there will be breaks to allow you to practice the code

Page 4: Introduction to R Lecture 1: Getting Started

About the Course

• This seminar is 1 unit pass/fail

• To pass, attend 5 out of 7 seminars

• Very little outside work

Page 5: Introduction to R Lecture 1: Getting Started

About the Course

• Some learning objectives include:
  – Importing/exporting data
  – Data management
  – Performing calculations
  – Recoding variables
  – Producing graphics
  – Installing packages
  – Writing functions

Page 6: Introduction to R Lecture 1: Getting Started

About the Course

• Course communication via E-mail

• Lectures and code will be hosted on my webpage
  – http://www.biostat.jhsph.edu/~ajaffe/rseminar.html

Page 7: Introduction to R Lecture 1: Getting Started

About the Instructor

• 3rd year PhD student in Genetic Epi program, concurrent MHS in Bioinformatics

• Learned R five years ago; have been using it regularly for the last two

Page 8: Introduction to R Lecture 1: Getting Started

Lecture 1

• Course overview
• What is R?
• Installing R
• Installing a text editor
• Interfacing text editor with R
• Writing scripts
• Using R as a calculator
• Assignment

Page 9: Introduction to R Lecture 1: Getting Started

What is R?

• R is a language and environment for statistical computing and graphics

• R is an open-source implementation of the S language, which was developed at Bell Laboratories

• R is both open source and open development

http://www.r-project.org/

Page 10: Introduction to R Lecture 1: Getting Started

What is R?

• Pros:
  – Free
  – Tons of packages, very flexible
  – Multiple datasets at any given time

• Cons:
  – Much more “programming” oriented
  – Minimal interface

These are my personal opinions

Page 11: Introduction to R Lecture 1: Getting Started

What is R?

• Often a good first step for data cleaning and manipulation

• Then, export the data to Stata or SAS for Epi analyses

Page 12: Introduction to R Lecture 1: Getting Started

What is R?

[Figure: screenshot showing the R console and a script window]

Page 13: Introduction to R Lecture 1: Getting Started

Lecture 1

• Course overview
• What is R?
• Installing R
• Installing a text editor
• Interfacing text editor with R
• Writing scripts
• Using R as a calculator
• Assignment

Page 14: Introduction to R Lecture 1: Getting Started

Installing R

• http://cran.r-project.org/

Page 15: Introduction to R Lecture 1: Getting Started

Installing R - Windows

• Windows: click “base” and download

Page 16: Introduction to R Lecture 1: Getting Started

Installing R - Windows

• Click the link to the latest build

Page 17: Introduction to R Lecture 1: Getting Started

Installing R - Mac

• Mac: click the latest package’s .pkg file

Page 18: Introduction to R Lecture 1: Getting Started

Installing R

• Double click the downloaded file

• Hit ‘next’ a few times

• Use default settings

• Finish installing
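
Once the installer finishes, one quick sanity check is to open R and ask which version is running. This is only a minimal sketch; the text printed depends on the build you installed.

```r
# Print the version string of the running R installation
R.version.string

# The individual pieces are also available by name
R.version$major
R.version$minor
```

If these lines print a version rather than an error, the console is working.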

Page 19: Introduction to R Lecture 1: Getting Started

Lecture 1

• Course overview
• What is R?
• Installing R
• Installing a text editor
• Interfacing text editor with R
• Writing scripts
• Using R as a calculator
• Assignment

Page 20: Introduction to R Lecture 1: Getting Started

Installing a Text Editor

• Windows: R’s built-in text editor is terrible
  – It’s essentially Windows Notepad
  – We will download a much better one

• Mac: R’s built-in text editor is sufficient
  – Color coding, signals parenthesis closing, etc.
  – I suggest using this until you think you need a better one

Page 21: Introduction to R Lecture 1: Getting Started

Installing a Text Editor

• I prefer Notepad++:
  – http://notepad-plus-plus.org/
  – Download the current version: http://download.tuxfamily.org/notepadplus/5.7/npp.5.7.Installer.exe
  – Install on your computer using defaults

Page 22: Introduction to R Lecture 1: Getting Started

Installing a Text Editor

Page 23: Introduction to R Lecture 1: Getting Started

Lecture 1

• Course overview
• What is R?
• Installing R
• Installing a text editor
• Interfacing text editor with R
• Writing scripts
• Using R as a calculator
• Assignment

Page 24: Introduction to R Lecture 1: Getting Started

Interfacing with R

• Scripts: documents that contain reproducible R code and functions that you can send to the console (and save)
  – Files are designated with the “.R” extension
  – You can “source” scripts (more later)

• Console: Type commands directly into the console
  – Good for looking at your data, trying things, and plotting
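
To make the script/console distinction concrete, here is a small sketch of what "sourcing" does. The temporary file and the name demo_value are made up for illustration; in practice you would point source() at a .R file you saved yourself.

```r
# Write a two-line script to a temporary file (a stand-in for a saved .R file)
script_path <- tempfile(fileext = ".R")
writeLines(c("# demo script", "demo_value <- 2 + 3"), script_path)

# source() runs every line of the file, as if typed into the console
source(script_path)

# The variable created by the script is now available in the console
demo_value
```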

Page 25: Introduction to R Lecture 1: Getting Started

Interfacing with R - Mac

• Mac: File → New Script

• This opens the default text editor

• To send a line of code to the R console, press Apple+Enter when the cursor is anywhere on that line

• Highlight chunks of code and press Apple+Enter to send

Page 26: Introduction to R Lecture 1: Getting Started

Interfacing with R - Windows

• Using the default text editor, pressing Ctrl+R sends lines to the console

• However, we want to use Notepad++

• We need to download one more thing…

Page 27: Introduction to R Lecture 1: Getting Started

Interfacing with R - Windows

• “NppToR”: Notepad++ to R

• http://sourceforge.net/projects/npptor/

• It must be running when R and Notepad++ are open

• When properly configured, press F8 to send lines of code, or highlighted chunks, to the console

• I will help configure this after class today

Page 29: Introduction to R Lecture 1: Getting Started

Lecture 1

• Course overview
• What is R?
• Installing R
• Installing a text editor
• Interfacing text editor with R
• Writing scripts
• Using R as a calculator
• Assignment

Page 30: Introduction to R Lecture 1: Getting Started

Writing Scripts

• The comment symbol is # (pound) in R

• Comment liberally - you should be able to understand a script after not seeing it for 6 months

• Lines of #’s are useful to separate sections

• Useful for designating headers

Page 31: Introduction to R Lecture 1: Getting Started

Writing Scripts

#################
# Title: Demo R Script
# Author: Andrew Jaffe
# Date: 7/30/10
# Purpose: Demonstrate comments in R
#################

# this is a comment, nothing to the right of it gets read
# this # is still a comment – you can use as many #’s as you want

# sometimes you have a really long comment, like explaining what you
# are doing for a step in analysis. Take it to a second line

Page 32: Introduction to R Lecture 1: Getting Started

Writing Scripts

• Some common etiquette:
  – You can also use spaces (more generally, “white space”) liberally within functions and commands
  – Try to keep a reasonable number of characters per line – many commands can be broken across multiple lines
  – More to come later…
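
As a rough illustration of the two etiquette points above, the sketch below shows the same command with and without white space, and a longer command split across lines. The variable names are invented for this example.

```r
# These two lines do the same thing; the second is easier to read
total<-sum(c(1,2,3,4))
total <- sum(c(1, 2, 3, 4))

# A long command can continue on the next line as long as the
# current line is clearly unfinished (here, an open parenthesis)
long_total <- sum(
  c(1, 2, 3, 4),
  c(5, 6, 7, 8)
)
long_total
```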

Page 33: Introduction to R Lecture 1: Getting Started

Lecture 1

• Course overview
• What is R?
• Installing R
• Installing a text editor
• Interfacing text editor with R
• Writing scripts
• Using R as a calculator
• Assignment

Page 34: Introduction to R Lecture 1: Getting Started

R as a Calculator

• The R console functions as a full calculator

• Try to play around with it:
  – +, -, *, / are add, subtract, multiply, and divide
  – ^ or ** is power
  – ( and ) control the order of operations
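
A few lines you might type at the console to try the operators listed above. The numbers are arbitrary; the comments give the value each line returns.

```r
2 + 3        # addition: 5
7 - 10       # subtraction: -3
4 * 2.5      # multiplication: 10
9 / 2        # division: 4.5
2 ^ 3        # power: 8
2 ** 3       # same as 2 ^ 3
2 + 3 * 4    # multiplication happens first: 14
(2 + 3) * 4  # parentheses change the order: 20
```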

Page 35: Introduction to R Lecture 1: Getting Started

Lecture 1

• Course overview
• What is R?
• Installing R
• Installing a text editor
• Interfacing text editor with R
• Writing scripts
• Using R as a calculator
• Assignment

Page 36: Introduction to R Lecture 1: Getting Started

Assignment

• The assignment… operator: assigning a value to a name

• R accepts two operators: “<-” and “=”
  – e.g. x=8 (remember whitespace! x = 8, x <- 8)

• Variable names are case-sensitive
  – e.g. X and x are different

• Set x = 8, and try using calculator functions on x
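
One way the exercise above might look at the console. The names X and y are extra variables added here only to illustrate case sensitivity and the "=" operator.

```r
x <- 8     # assign 8 to the name x
x          # typing the name prints the value
x + 2      # 10
x / 4      # 2
x ^ 2      # 64

X <- 100   # a different variable: names are case-sensitive
x == X     # FALSE
y = x * 3  # "=" also assigns; y is 24
```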

Page 37: Introduction to R Lecture 1: Getting Started

Assignment

• ‘Assignment’ literally puts whatever is on the right side of the operator into the variable on the left-hand side
  – Note that although you can name variables almost anything, you might run into issues if you give them the same names as default R functions
  – Notepad++ turns function names red/pink so you know…
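
The naming issue can be sketched as follows. This is an illustration only, using the built-in functions mean() and sum(); the replacement function is deliberately silly.

```r
# Assigning a value to the name of a built-in function is allowed
mean <- 10

# R still finds the function when the name is used as a call...
mean(c(1, 2, 3))   # 2

# ...but the script is now harder to read, so remove the variable
rm(mean)

# A function you define with a built-in name does hide the original
sum <- function(...) "not the real sum"
sum(1, 2)          # "not the real sum"
rm(sum)            # the built-in sum() is visible again
sum(1, 2)          # 3
```

Choosing names such as my_mean or total avoids the problem entirely.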

Page 38: Introduction to R Lecture 1: Getting Started

Examples of assignment, introducing R data

Enough to get R up and running if this is the only class you attend. We will see them in much more detail over the next three sessions

Page 39: Introduction to R Lecture 1: Getting Started

Assignment

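status <- c("case", "case", "case", "control", "control", "control")

status          # print the vector

class(status)   # "character"

table(status)   # count of each value

factor(status)  # convert to a factor with levels "case" and "control"

[alternatively: status <- c(rep("case", 3), rep("control", 3))]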
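
To see what these commands return, the sketch below stores each result under a name (counts and status_f are names chosen here, not part of the lecture code) and looks inside the factor.

```r
status <- c(rep("case", 3), rep("control", 3))

class(status)            # "character"
counts <- table(status)  # counts per value: 3 case, 3 control
counts

status_f <- factor(status)
class(status_f)          # "factor"
levels(status_f)         # "case" "control"
as.integer(status_f)     # 1 1 1 2 2 2 (the codes stored under the labels)
```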

Page 40: Introduction to R Lecture 1: Getting Started

Assignment

• web <- "http://www.biostat.jhsph.edu/~ajaffe/code/lec1_code.R"
  – class(web)
  – source(web)

• You also don’t have to save tables/data you find online to your disk (note: read.table works for most things; the examples below aren’t tables, though)
  – scan(web, what = character(0), sep = "\n")
  – scan("http://www.google.com", what = character(0))
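
The same reading functions work on any text source, not only URLs. As an offline sketch, read.table() can parse text passed through its text argument much as it would parse a file found online. The three-line table and the name raw_text are made up for illustration.

```r
# Hypothetical table contents, standing in for a file found online
raw_text <- "id sex status
1 male case
2 female control"

tab <- read.table(text = raw_text, header = TRUE, stringsAsFactors = FALSE)
class(tab)   # "data.frame"
dim(tab)     # 2 rows, 3 columns
tab$status   # "case" "control"
```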

Page 41: Introduction to R Lecture 1: Getting Started

Assignment

mat <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE) # this is sourced in

class(mat)
mat
mat + mat    # element-wise addition
mat * mat    # element-wise multiplication
mat %*% mat  # matrix multiplication
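
The difference between * and %*% is easy to miss, so here is the same 2 x 2 matrix worked through, with each result summarized row by row in the comments.

```r
mat <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE)
# mat is
#      [,1] [,2]
# [1,]    1    2
# [2,]    3    4

mat + mat     # element by element: 2 4 / 6 8
mat * mat     # also element by element: 1 4 / 9 16
mat %*% mat   # matrix product: 7 10 / 15 22
```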

Page 42: Introduction to R Lecture 1: Getting Started

Assignment

• class(dat) # dat is also sourced in

• head(dat)

• table(dat$sex, dat$status)

• …To be continued…
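
If you cannot source the lecture code, a small stand-in data frame lets you try the same commands. The rows below are invented, and the real dat may have different columns and values.

```r
# Hypothetical stand-in for the sourced 'dat' (the real columns may differ)
dat <- data.frame(
  sex    = c("M", "F", "F", "M", "F", "M"),
  status = c("case", "case", "case", "control", "control", "control"),
  stringsAsFactors = FALSE
)

class(dat)   # "data.frame"
head(dat)    # first six rows (here, all of them)

# Cross-tabulate: rows are sex, columns are status
tab <- table(dat$sex, dat$status)
tab
```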

Page 43: Introduction to R Lecture 1: Getting Started

Questions?