TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram...

29
plotly The Johns Hopkins Data Science Lab February 28, 2017

Transcript of TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram...

Page 1: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

plotly

The Johns Hopkins Data Science Lab

February 28, 2017

Page 2: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

What is Plotly?

Plotly is a web application for creating and sharing datavisualizations. Plotly can work with several programming languagesand applications including R, Python, and Microsoft Excel. We’regoing to concentrate on creating different graphs with Plotly andsharing those graphs.

Page 3: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Installing Plotly

Installing Plotly is easy:

install.packages("plotly")library(plotly)

If you want to share your visualizations on https://plot.ly/ youshould make an account on their site.

Page 4: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Basic Scatterplot

A basic scatterplot is easy to make, with the added benefit oftooltips that appear when your mouse hovers over each point.Specify a scatterplot by indicating type = "scatter". Notice thatthe arguments for the x and y variables as specified as formulas,with the tilde operator (~) preceding the variable that you’replotting.

library(plotly)plot_ly(mtcars, x = ~wt, y = ~mpg, type = "scatter")

Page 5: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Basic Scatterplot

2 3 4 5

10

15

20

25

30

35

wt

mpg

Page 6: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Scatterplot Color

You can add color to your scatterplot points according to acategorical variable in the data frame you use with plot_ly().

plot_ly(mtcars, x = ~wt, y = ~mpg, type = "scatter", color = ~factor(cyl))

Page 7: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Scatterplot Color

2 3 4 5

10

15

20

25

30

35

wt

mpg

468

Page 8: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Continuous Color

You can also show continuous variables with color in scatterplots.

plot_ly(mtcars, x = ~wt, y = ~mpg, type = "scatter", color = ~disp)

Page 9: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Continuous Color

2 3 4 5

10

15

20

25

30

35

wt

mpg

100

150

200

250

300

350

400

450

disp

Page 10: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Scatterplot Sizing

By specifying the size argument you can make each point in yourscatterplot a different size.

plot_ly(mtcars, x = ~wt, y = ~mpg, type = "scatter",color = ~factor(cyl), size = ~hp)

Page 11: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Scatterplot Sizing

2 3 4 5

10

15

20

25

30

35

wt

mpg

468

Page 12: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

3D Scatterplot

You can create a three-dimensional scatterplot with the type ="scatter3d" argument. If you click and drag these scatterplotsyou can view them from different angles.

set.seed(2016-07-21)temp <- rnorm(100, mean = 30, sd = 5)pressue <- rnorm(100)dtime <- 1:100plot_ly(x = ~temp, y = ~pressue, z = ~dtime,

type = "scatter3d", color = ~temp)

Page 13: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

3D Scatterplot

Webglisnotsupportedbyyourbrowser-visithttp://get.webgl.orgformoreinfo

20

25

30

35

40

temp

Page 14: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Line Graph

Line graphs are the default graph for plot_ly(). They’re of courseuseful for showing change over time:

data("airmiles")plot_ly(x = ~time(airmiles), y = ~airmiles, type = "scatter", mode = "lines")

Page 15: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Line Graph

1940 1945 1950 1955 1960

0

5k

10k

15k

20k

25k

30k

time(airmiles)

airmiles

Page 16: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Multi Line Graph

You can show multiple lines by specifying the column in the dataframe that separates the lines:

library(plotly)library(tidyr)library(dplyr)data("EuStockMarkets")

stocks <- as.data.frame(EuStockMarkets) %>%gather(index, price) %>%mutate(time = rep(time(EuStockMarkets), 4))

plot_ly(stocks, x = ~time, y = ~price, color = ~index, type = "scatter", mode = "lines")

Page 17: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Multi Line Graph

1992 1994 1996 1998

2000

3000

4000

5000

6000

7000

8000

time

price

CACDAXFTSESMI

Page 18: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Histogram

A histogram is great for showing how counts of data are distributed.Use the type = "histogram" argument.

plot_ly(x = ~precip, type = "histogram")

Page 19: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Histogram

10 20 30 40 50 60 700

2

4

6

8

10

12

14

precip

Page 20: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Boxplot

Boxplots are wonderful for comparing how different datasets aredistributed. Specify type = "box" to create a boxplot.

plot_ly(iris, y = ~Petal.Length, color = ~Species, type = "box")

Page 21: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Boxplot

setosa versicolor virginica

1

2

3

4

5

6

7

Petal.Length

setosaversicolorvirginica

Page 22: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Heatmap

Heatmaps are useful for displaying three dimensional data in twodimensions, using color for the third dimension. Create a heatmapby using the type = "heatmap" argument.

terrain1 <- matrix(rnorm(100*100), nrow = 100, ncol = 100)plot_ly(z = ~terrain1, type = "heatmap")

Page 23: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Heatmap

0 20 40 60 800

20

40

60

80

−3

−2

−1

0

1

2

3

4

terrain1

Page 24: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

3D Surface

Why limit yourself to two dimensions when you can render threedimensions on a computer!? Create moveable 3D surfaces withtype = "surface".

terrain2 <- matrix(sort(rnorm(100*100)), nrow = 100, ncol = 100)plot_ly(z = ~terrain2, type = "surface")

Page 25: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

3D Surface

Webglisnotsupportedbyyourbrowser-visithttp://get.webgl.orgformoreinfo

−4

−3

−2

−1

0

1

2

3

terrain2

Page 26: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Choropleth Maps: SetupChoropleth maps illustrate data across geographic areas by shadingregions with different colors. Choropleth maps are easy to makewith Plotly though they require more setup compared to otherPlotly graphics.

# Create data framestate_pop <- data.frame(State = state.abb, Pop = as.vector(state.x77[,1]))# Create hover textstate_pop$hover <- with(state_pop, paste(State, '<br>', "Population:", Pop))# Make state borders whiteborders <- list(color = toRGB("red"))# Set up some mapping optionsmap_options <- list(

scope = 'usa',projection = list(type = 'albers usa'),showlakes = TRUE,lakecolor = toRGB('white')

)

Page 27: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Choropleth Maps: Mapping

plot_ly(z = ~state_pop$Pop, text = ~state_pop$hover, locations = ~state_pop$State,type = 'choropleth', locationmode = 'USA-states',color = state_pop$Pop, colors = 'Blues', marker = list(line = borders)) %>%

layout(title = 'US Population in 1975', geo = map_options)

Page 28: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

Choropleth MapsUSPopulationin1975

5k

10k

15k

20k

Page 29: TheJohnsHopkinsDataScienceLab February28,2017 · plot_ly(x =~precip,type ="histogram") Histogram Boxplot Boxplotsarewonderfulforcomparinghowdifferentdatasetsare distributed. Specifytype

More Resources

I The Plolty WebsiteI The Plotly R APII The Plotly R Package on GitHubI The Plotly R CheatsheetI “Plotly for R” book by Carson Sievert