Introduction to R Graphics with ggplot2

60
Introduciton to R Graphics with ggplot2 Harvard MIT Data Center May 10, 2013 e Institute for Quantitative Social Science at Harvard University (Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 1 / 60

description

This introduction to the popular ggplot2 R graphics package will show you how to create a wide variety of graphical displays in R. Data sets and additional workshop materials available at http://projects.iq.harvard.edu/rtc/event/r-graphics

Transcript of Introduction to R Graphics with ggplot2

Page 1: Introduction to R Graphics with ggplot2

Introduciton to R Graphics with ggplot2

Harvard MIT Data Center

May 10, 2013

�e Institutefor Quantitative Social Scienceat Harvard University

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 1 / 60

Page 2: Introduction to R Graphics with ggplot2

Outline

1 Introduction

2 Geometric Objects And Aesthetics

3 Statistical Transformations

4 Scales

5 Faceting

6 Themes

7 The #1 FAQ

8 Putting It All Together

9 Wrap-up

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 2 / 60

Page 3: Introduction to R Graphics with ggplot2

Introduction

Topic

1 Introduction

2 Geometric Objects And Aesthetics

3 Statistical Transformations

4 Scales

5 Faceting

6 Themes

7 The #1 FAQ

8 Putting It All Together

9 Wrap-up

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 3 / 60

Page 4: Introduction to R Graphics with ggplot2

Introduction

Class Files And Administrative Details

User name: dataclassPassword: dataclassCopy Rgraphics folder from shared drive to your desktopClass Structure and Organization

Ask questions at any time. Really!Collaboration is encouragedThis is your class! Special requests are encouraged

This is an intermediate R courseAssumes working knowledge of RRelatively fast-pacedFocus is on ggplot2 graphics–other packages will not be covered

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 4 / 60

Page 5: Introduction to R Graphics with ggplot2

Introduction

Starting A The End

My goal: by the end of the workshop you will be able to reproduce thisgraphic from the Economist:

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 5 / 60

Page 6: Introduction to R Graphics with ggplot2

Introduction

Why ggplot2?

Advantages of ggplot2Consistent underlying grammar of graphics (Wilkinson, 2005)Plot specification at a high level of abstractionVery flexibleTheme system for polishing plot appearanceActive maintenance and development–getting better all the timeMany users, active mailing list

Things you cannot do With ggplot23-dimensional graphicsGraph-theory type graphs (nodes/edges layout)

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 6 / 60

Page 7: Introduction to R Graphics with ggplot2

Introduction

What Is The Grammar Of Graphics?

The basic idea: independently specify plot building blocksAnatomy of a plot:

dataaesthetic mappinggeometric objectstatistical transformationsscalescoordinate systemposition adjustmentsfaceting

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 7 / 60

Page 8: Introduction to R Graphics with ggplot2

Introduction

Example data I: mtcars

> print(head(mtcars, 4))mpg cyl disp hp drat wt qsec vs am gear carb

Mazda RX4 21.0 6 160 110 3.90 2.62 16.5 0 1 4 4Mazda RX4 Wag 21.0 6 160 110 3.90 2.88 17.0 0 1 4 4Datsun 710 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1Hornet 4 Drive 21.4 6 258 110 3.08 3.21 19.4 1 0 3 1

mpg Miles/(US) galloncyl Number of cylindersdisp Displacement (cu.in.)hp Gross horsepowerdrat Rear axle ratiowt Weight (lb/1000)qsec 1/4 mile timevs V/Sam Transmission (0 = automatic, 1 = manual)gear Number of forward gearscarb Number of carburetors

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 8 / 60

Page 9: Introduction to R Graphics with ggplot2

Introduction

ggplot2 VS Base Graphics

Compared to base graphics, ggplot2is more verbose for simple / canned graphicsis less verbose for complex / custom graphicsdoes not have methods (data should always be in a data.frame)uses a different system for adding plot elements

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 9 / 60

Page 10: Introduction to R Graphics with ggplot2

Introduction

ggplot2 VS Base Graphics

Base graphics VS ggplot for simple graphs:

hist(mtcars$mpg) ggplot(mtcars, aes(x = mpg)) +geom_histogram(binwidth = 5)

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 10 / 60

Page 11: Introduction to R Graphics with ggplot2

Introduction

ggplot2 VS Base Graphics

Base graphics VS ggplot for complex graphs:

par(mar = c(4,4,.1,.1))plot(mpg ~ hp,

data=subset(mtcars, am==1),xlim=c(50, 450),ylim=c(5, 40))

points(mpg ~ hp, col="red",data=subset(mtcars, am==0))

legend(350, 40,c("1", "0"), title="am",col=c("black", "red"),pch=c(1, 1))

ggplot(mtcars, aes(x=hp,y=mpg,color=factor(am)))+

geom_point()

#

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 11 / 60

Page 12: Introduction to R Graphics with ggplot2

Geometric Objects And Aesthetics

Topic

1 Introduction

2 Geometric Objects And Aesthetics

3 Statistical Transformations

4 Scales

5 Faceting

6 Themes

7 The #1 FAQ

8 Putting It All Together

9 Wrap-up

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 12 / 60

Page 13: Introduction to R Graphics with ggplot2

Geometric Objects And Aesthetics

Aesthetic Mapping

In ggplot land aesthetic means "something you can see"Examples include:

position (i.e., on the x and y axes)color ("outside" color)fill ("inside" color)shape (of points)linetypesize

Each type of geom accepts only a subset of all aesthetics–refer to thegeom help pages to see what mappings each geom acceptsAesthetic mappings are set with the aes() function

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 13 / 60

Page 14: Introduction to R Graphics with ggplot2

Geometric Objects And Aesthetics

Geometic Objects (geom)

Geometric objects are the actual marks we put on a plotExamples include:

points (geom_point, for scatter plots, dot plots, etc)lines (geom_line, for time series, trend lines, etc)boxplot (geom_boxplot, for, well, boxplots!)

A plot must have at least one geom; there is no upper limitAdd a geom to a plot using the + operatorYou can get a list of available geometric objects:> geoms <- help.search("geom_", package = "ggplot2")

> geoms$matches[1:4, 1:2]topic title

[1,] "geom_abline" "Line specified by slope and intercept."[2,] "geom_area" "Area plot."[3,] "geom_bar" "Bars, rectangles with bases on x-axis"[4,] "geom_bin2d" "Add heatmap of 2d bin counts."

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 14 / 60

Page 15: Introduction to R Graphics with ggplot2

Geometric Objects And Aesthetics

Points (Scatterplot)

Now that we know about geometric objects and aesthetic mapping, wecan make a ggplot

ggplot(mtcars, aes(x = hp, y = mpg)) +geom_point()

ggplot(mtcars, aes(x = hp, y = mpg)) +geom_point(aes(y=log10(mpg)))

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 15 / 60

Page 16: Introduction to R Graphics with ggplot2

Geometric Objects And Aesthetics

Lines (Prediction Line)

A plot constructed with ggplot can have more than one geomOur hp vs mpg plot could use a regression line:

mtcars$pred.mpg <- predict(lm(mpg ~ hp, data = mtcars))

p1 <- ggplot(mtcars, aes(x = hp, y = mpg))

p1 + geom_point(aes(color = wt)) +geom_line(aes(y = pred.mpg))

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 16 / 60

Page 17: Introduction to R Graphics with ggplot2

Geometric Objects And Aesthetics

Smoothers

Not all geometric objects are simple shapes–the smooth geom includes aline and a ribbon

p2 <- ggplot(mtcars, aes(x = hp, y = mpg))

p2 + geom_point(aes(color = wt)) +geom_smooth()

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 17 / 60

Page 18: Introduction to R Graphics with ggplot2

Geometric Objects And Aesthetics

Text (Label Points)

Each geom accepts a particualar set of mappings–for examplegeom_text() accepts a labels mapping

p2 + geom_point(aes(color = wt)) +geom_smooth() +geom_text(aes(label=rownames(mtcars)), size=2)

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 18 / 60

Page 19: Introduction to R Graphics with ggplot2

Geometric Objects And Aesthetics

Aesthetic Mapping VS Assignment

Note that variables are mapped to aesthetics with the aes() function,while fixed aesthetics are set outside the aes() callThis sometimes leads to confusion, as in this example:

ggplot(mtcars, aes(x = hp, y = mpg)) +geom_point(aes(size = 2),# incorrect! 2 is not a variable

color="red") # this is fine -- all points red

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 19 / 60

Page 20: Introduction to R Graphics with ggplot2

Geometric Objects And Aesthetics

Mapping Variables To Other Aesthetics

Other aesthetics are mapped in the same way as x and y in the previousexample

ggplot(mtcars, aes(x = hp, y = mpg)) +geom_point(aes(color=wt, shape = factor(am)))

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 20 / 60

Page 21: Introduction to R Graphics with ggplot2

Geometric Objects And Aesthetics

Exercise I

1 Create a scatter plot with displacement on the x axis and horse poweron the y axis

2 Color the points in the previous plot blue3 Color the points in the previous plot according to miles per gallon4 Exercise I prototype :noexport:

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 21 / 60

Page 22: Introduction to R Graphics with ggplot2

Statistical Transformations

Topic

1 Introduction

2 Geometric Objects And Aesthetics

3 Statistical Transformations

4 Scales

5 Faceting

6 Themes

7 The #1 FAQ

8 Putting It All Together

9 Wrap-up

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 22 / 60

Page 23: Introduction to R Graphics with ggplot2

Statistical Transformations

Statistical Transformations

Some plot types (such as scatterplots) do not requiretransformations–each point is plotted at x and y coordinates equal tothe original valueOther plots, such as boxplots, histograms, prediction lines etc. requirestatistical transformations

For a boxplot the y values must be transformed to the median and1.5(IQR)For a smoother smother the y values must be transformed into predictedvalues

Each geom has a default statistic, but these can be changedFor example, the default statistic for geom_bar is stat_bin> args(geom_bar)

function (mapping = NULL, data = NULL, stat = "bin", position = "stack",...)

NULL> # ?stat_bin>

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 23 / 60

Page 24: Introduction to R Graphics with ggplot2

Statistical Transformations

Setting Statistical Transformation Arguments

Arguments to stat_ functions are passed through geom_ functionsSlightly annoying because in order to change it you have to firstdetermine which stat the geom uses, then determine the arguments tothat stat

ggplot(mtcars, aes(x = mpg)) +geom_bar()

ggplot(mtcars, aes(x = mpg)) +geom_bar(stat = "bin", binwidth=4)

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 24 / 60

Page 25: Introduction to R Graphics with ggplot2

Statistical Transformations

Changing The Statistical Transformation

Sometimes the default statistical transformation is not what you needOften the case with pre-summarized data> (mtc.sum <- aggregate(mtcars["mpg"], mtcars["gear"], FUN=mean))gear mpg

1 3 16.12 4 24.53 5 21.4

> ggplot(mtc.sum, aes(x=gear, y=mpg)) +geom_bar()

Mapping a variable to y and alsousing stat="bin".Error in pmin(y, 0) : object’y’ not found

.

ggplot(mtc.sum, aes(x=gear, y=mpg)) +geom_bar(stat="identity")

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 25 / 60

Page 26: Introduction to R Graphics with ggplot2

Statistical Transformations

Exercise II

1 Create boxplots of mpg by gear2 Overlay points on top of the box plots3 Create a scatter plot of weight vs. horsepower4 Overlay a linear regression line on top of the scatter plot

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 26 / 60

Page 27: Introduction to R Graphics with ggplot2

Scales

Topic

1 Introduction

2 Geometric Objects And Aesthetics

3 Statistical Transformations

4 Scales

5 Faceting

6 Themes

7 The #1 FAQ

8 Putting It All Together

9 Wrap-up

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 27 / 60

Page 28: Introduction to R Graphics with ggplot2

Scales

Scales: Controlling Aesthetic Mapping

In ggplot2 scales includepositioncolor and fillsizeshapeline type

Modified with scale_<aesthetic>_<type>

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 28 / 60

Page 29: Introduction to R Graphics with ggplot2

Scales

Common Scale Arguments

name: the first argument gives the axis or legend titlelimits: the minimum and maximum of the scalebreaks: the points along the scale where labels should appearlabels: the labels that appear at each break

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 29 / 60

Page 30: Introduction to R Graphics with ggplot2

Scales

Scale Modification Examples

p6 <- ggplot(mtcars, aes(x = factor(gear), y = mpg))p6 + geom_point(aes(color = wt))

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 30 / 60

Page 31: Introduction to R Graphics with ggplot2

Scales

Scale breaks and labels

p7 <- p6 + geom_point(aes(color = wt)) +scale_x_discrete("Number of Gears",

breaks = c("3", "4", "5"),labels = c("Three", "Four", "Five"))

p7 + scale_color_continuous("Weight",breaks = with(mtcars, c(min(wt), median(wt), max(wt))),labels = c("Light", "Medium", "Heavy"))

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 31 / 60

Page 32: Introduction to R Graphics with ggplot2

Scales

Scale breaks and labels

p7 + scale_color_continuous("Weight",breaks = with(mtcars, c(min(wt), median(wt), max(wt))),labels = c("Light", "Medium", "Heavy"),low = "black",high = "gray80")

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 32 / 60

Page 33: Introduction to R Graphics with ggplot2

Scales

Using different color scales

p7 + scale_color_gradient2("Weight",breaks = with(mtcars, c(min(wt), median(wt), max(wt))),labels = c("Light", "Medium", "Heavy"),low = "blue",mid = "black",high = "red",midpoint = median(mtcars$wt))

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 33 / 60

Page 34: Introduction to R Graphics with ggplot2

Scales

Scale Modification Examples

p8 <- ggplot(mtcars, aes(x = factor(gear), y = mpg))p8 + geom_point(aes(size = wt))

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 34 / 60

Page 35: Introduction to R Graphics with ggplot2

Scales

Scale range

p8 + geom_point(aes(size = wt)) +scale_size_continuous("Weight",

range = c(2, 10))

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 35 / 60

Page 36: Introduction to R Graphics with ggplot2

Scales

Available Scales

Partial combination matrix of available scales

Scale Types Examplesscale_color_ identity scale_fill_continuousscale_fill_ manual scale_color_discretescale_size_ continuous scale_size_manual

discrete scale_size_discretescale_shape_ discrete scale_shape_discretescale_linetype_ identity scale_shape_manual

manual scale_linetype_discretescale_x_ continuous scale_x_continuousscale_y_ discrete scale_y_discrete

reverse scale_x_loglog scale_y_reversedate scale_x_datedatetime scale_y_datetime

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 36 / 60

Page 37: Introduction to R Graphics with ggplot2

Scales

Exercise III

1 Experiment with color, size, and shape aesthetics / scales2 What happens when you map more than one aesthetic to a variable?3 Which aesthetics are good for continuous variables? Which work better

for discrete variables?

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 37 / 60

Page 38: Introduction to R Graphics with ggplot2

Faceting

Topic

1 Introduction

2 Geometric Objects And Aesthetics

3 Statistical Transformations

4 Scales

5 Faceting

6 Themes

7 The #1 FAQ

8 Putting It All Together

9 Wrap-up

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 38 / 60

Page 39: Introduction to R Graphics with ggplot2

Faceting

Faceting

Faceting is ggplot2 parlance for small multiplesThe idea is to create separate graphs for subsets of dataggplot2 offers two functions for creating small multiples:

1 facet_wrap(): define subsets as the levels of a single grouping variable2 facet_grid(): define subsets as the crossing of two grouping variables

Facilitates comparison among plots, not just of geoms within a plot

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 39 / 60

Page 40: Introduction to R Graphics with ggplot2

Faceting

Example Data II: Titanic

> titanic <- read.csv(paste("http://biostat.mc.vanderbilt.edu/",+ "wiki/pub/Main/",+ "DataSets/titanic3.csv", sep = ""))>

variable descriptionpclass Passenger Classsurvival Survivalname Namesex Sexage Agesibsp Number of Siblings/Spouses Aboardparch Number of Parents/Children Aboardticket Ticket Numberfare Passenger Farecabin Cabinembarked Port of Embarkationboat Lifeboatbody Body Identification Numberhome.dest Home/Destination

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 40 / 60

Page 41: Introduction to R Graphics with ggplot2

Faceting

What happened on the Titanic?

Start by transforming the data into a more usable form:> library(plyr)> titanic$age.g <- as.integer(cut(titanic$age, 10))> td <- ddply(titanic, c("pclass", "age.g", "sex"),+ summarize,+ ps = length(survived[survived == 1])/length(survived))>

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 41 / 60

Page 42: Introduction to R Graphics with ggplot2

Faceting

What happened on the Titanic?

Basic scatter plot:

p8 <- ggplot(td, aes(x = age.g, y = ps))p8 + geom_point()

Why do we have two clusters of points?

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 42 / 60

Page 43: Introduction to R Graphics with ggplot2

Faceting

What happened on the Titanic?

Use the techniques we already know (aesthetic mapping):

p8 <- ggplot(td, aes(x = age.g, y = ps))p8 + geom_point(aes(color = pclass, shape = sex))

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 43 / 60

Page 44: Introduction to R Graphics with ggplot2

Faceting

What happened on the Titanic?

Use faceting in one dimension

p8 + geom_point(aes(shape = sex)) +facet_wrap(~ pclass)

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 44 / 60

Page 45: Introduction to R Graphics with ggplot2

Faceting

What happened on the Titanic?

Use faceting in two dimensions

p8 + geom_point() +facet_grid(sex ~ pclass)

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 45 / 60

Page 46: Introduction to R Graphics with ggplot2

Themes

Topic

1 Introduction

2 Geometric Objects And Aesthetics

3 Statistical Transformations

4 Scales

5 Faceting

6 Themes

7 The #1 FAQ

8 Putting It All Together

9 Wrap-up

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 46 / 60

Page 47: Introduction to R Graphics with ggplot2

Themes

Themes

The ggplot2 theme system handles non-data plot elements such asAxis labelsPlot backgroundFacet label backroundLegend appearance

Two built-in themes:theme_gray() (default)theme_bw()More available on the wiki:

https://github.com/hadley/ggplot2/wiki/Themes

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 47 / 60

Page 48: Introduction to R Graphics with ggplot2

Themes

Overriding theme defaults

Specific theme elements can be overridden using theme()

Example:

p7 + theme(plot.background = element_rect(fill = "white",colour = "gray40"))

You can see available options by printing theme_gray() or theme_bw()

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 48 / 60

Page 49: Introduction to R Graphics with ggplot2

Themes

Creating and saving new themes

You can create new themes, as in the following example:

theme_new <- theme_bw() +theme(text=element_text(size = 12, family = ""),

axis.text.x = element_text(colour = "red"),panel.background = element_rect(fill = "pink"))

p7 + theme_new

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 49 / 60

Page 50: Introduction to R Graphics with ggplot2

The #1 FAQ

Topic

1 Introduction

2 Geometric Objects And Aesthetics

3 Statistical Transformations

4 Scales

5 Faceting

6 Themes

7 The #1 FAQ

8 Putting It All Together

9 Wrap-up

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 50 / 60

Page 51: Introduction to R Graphics with ggplot2

The #1 FAQ

Map Aesthetic To Different Columns

The most frequently asked question goes something like this: I have twovariables in my data.frame, and I’d like to plot them as separate points, withdifferent color depending on which variable it is. How do I do that?

ggplot(mtcars, aes(x=wt)) +geom_point(aes(y=disp), color="red") +geom_point(aes(y=hp), color="blue")

#

library(reshape2)mtc.m <- melt(mtcars,

measure.vars=c("mpg","hp"))

ggplot(mtc.m,aes(x=wt,

y=value,color=variable)) +

geom_point()

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 51 / 60

Page 52: Introduction to R Graphics with ggplot2

Putting It All Together

Topic

1 Introduction

2 Geometric Objects And Aesthetics

3 Statistical Transformations

4 Scales

5 Faceting

6 Themes

7 The #1 FAQ

8 Putting It All Together

9 Wrap-up

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 52 / 60

Page 53: Introduction to R Graphics with ggplot2

Putting It All Together

Challenge: Recreate This Economist Graph

Data

The data is available in the EconomistData.csv file. Original sources arehttp://www.transparency.org/content/download/64476/1031428http://hdrstats.undp.org/en/indicators/display_cf_xls_indicator.cfm?indicator_id=103106&lang=en

Graph source:http://www.economist.com/node/21541178

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 53 / 60

Page 54: Introduction to R Graphics with ggplot2

Putting It All Together

Challenge Solution1 Load the data:

dat <- read.csv("dataSets/EconomistData.csv")

1 Create basic scatter plot

pc1 <- ggplot(dat, aes(x = CPI, y = HDI, color = Region))(pc1 <- pc1 + geom_point(shape = 1))

#

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 54 / 60

Page 55: Introduction to R Graphics with ggplot2

Putting It All Together

Challenge Solution

Add labels

label.these <- c("Congo", "Sudan", "Afghanistan", "Greece", "China","India", "Rwanda", "Spain", "France", "United States","Japan", "Norway", "Singapore")

(pc2 <- pc1 +geom_text(aes(label = Country),

color = "black", size = 3, hjust = 1.1,data = dat[dat$Country %in% label.these, ]))

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 55 / 60

Page 56: Introduction to R Graphics with ggplot2

Putting It All Together

Challenge Solution

Add smoothing line

(pc3 <- pc2 +geom_smooth(aes(group = 1),

method = "lm",color = "black",formula = y~ poly(x, 2),se = FALSE))

#

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 56 / 60

Page 57: Introduction to R Graphics with ggplot2

Putting It All Together

Challenge Solution

Finishing touches

(pc4 <- pc3 + theme_bw() +scale_x_continuous("Corruption Perceptions Index, 2011\n(10 = least corrupt)") +scale_y_continuous("Human Development Index, 2011\n(1 = best)") +theme(legend.position = "top", legend.direction = "horizontal"))

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 57 / 60

Page 58: Introduction to R Graphics with ggplot2

Wrap-up

Topic

1 Introduction

2 Geometric Objects And Aesthetics

3 Statistical Transformations

4 Scales

5 Faceting

6 Themes

7 The #1 FAQ

8 Putting It All Together

9 Wrap-up

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 58 / 60

Page 59: Introduction to R Graphics with ggplot2

Wrap-up

Help Us Make This Workshop Even Better!

Please take a moment to fill out a very short feedback formThese workshops exist for you – tell us what you need!http://tinyurl.com/R-graphics-feedback

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 59 / 60

Page 60: Introduction to R Graphics with ggplot2

Wrap-up

Additional resources

ggplot2 resourcesMailing list: http://groups.google.com/group/ggplot2Wiki: https://github.com/hadley/ggplot2/wikiWebsite: http://had.co.nz/ggplot2/StackOverflow:http://stackoverflow.com/questions/tagged/ggplot

IQSS resourcesResearch technology consulting:http://projects.iq.harvard.edu/rtcWorkshops:http://projects.iq.harvard.edu/rtc/filter_by/workshops

(Harvard MIT Data Center) Introduciton to R Graphics with ggplot2 May 10, 2013 60 / 60