Taking R Mainstream in Production Systems

Misha Lisovich [email protected]

The Question

Q: Should I use R in production?
A: Yes! (In a couple of years)

The Process

1. Productize
- Compelling data products
- Innovation pipeline

2. Ruggedize
- Toolchain: RStudio, devtools, GitHub, Travis CI, Docker
- Strong testing
- Production-ready architecture

3. Assimilate
- Command line tools
- Make it into HTTP APIs
- Make it into Docker containers

Step 1: Productize

Internal Products:
- Ad-hoc analyses
- Internal dashboards
- Automated reports
- Rapid prototyping

External Products:
- End-user data products
- Backend services

1. Dashboards

- Business Intelligence
- Internal Tools
- Data & Job Monitoring

2. Automated Reports

.Rmd -> html
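The .Rmd -> html step can be scripted so a scheduler produces the report without anyone opening an editor. The sketch below assumes the rmarkdown package and pandoc are installed (the slide names neither); the report name and contents are hypothetical, and rendering is skipped if the tooling is missing.

```r
# Write a tiny, hypothetical report source into a temporary directory.
report_dir <- tempfile("report_")
dir.create(report_dir)
rmd_path <- file.path(report_dir, "daily_report.Rmd")
fence <- strrep("`", 3)  # built at run time so the chunk markers can be embedded
writeLines(c(
  "---",
  "title: \"Daily Report\"",
  "output: html_document",
  "---",
  "",
  "Generated on `r Sys.Date()`.",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), rmd_path)

# Render an .Rmd file to HTML; returns the output path, or NULL if the
# rmarkdown package or pandoc is not available on this machine.
render_report <- function(input, output_dir = dirname(input)) {
  if (!requireNamespace("rmarkdown", quietly = TRUE) ||
      !rmarkdown::pandoc_available()) {
    message("rmarkdown or pandoc not available; skipping render")
    return(invisible(NULL))
  }
  rmarkdown::render(input, output_format = "html_document",
                    output_dir = output_dir, quiet = TRUE)
}

html_path <- render_report(rmd_path)
```

A cron job or CI task could call the same function with Rscript and publish the resulting HTML file.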

3. Rapid Prototyping

4. Backend Services

Batch Data Processing (ETL)

R APIs
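As an illustration of what an R API can look like, here is a minimal sketch of one endpoint. It uses annotation comments in the style of the plumber package, which is an assumption; the deck itself only names OpenCPU and rapier. The file name, route, and function name are hypothetical. The handler is plain R, so it can be called and tested without any web server.

```r
# api.R (hypothetical file name)

#* Return summary statistics for a comma-separated list of numbers.
#* @param values Comma-separated numbers, e.g. "1,2,3"
#* @get /summary
summarize_values <- function(values = "") {
  # Split the query string and keep only entries that parse as numbers.
  x <- suppressWarnings(as.numeric(strsplit(values, ",", fixed = TRUE)[[1]]))
  x <- x[!is.na(x)]
  if (length(x) == 0) {
    return(list(error = "no numeric values supplied"))
  }
  list(n = length(x), mean = mean(x), sd = stats::sd(x))
}

# To serve this file, assuming the plumber package is installed:
#   plumber::plumb("api.R")$run(port = 8000)
```

Keeping the handler a plain function means the same code can back an API, a batch job, or a unit test.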

5. End-user Products

Step 2: Ruggedize

1. Create reproducible architecture
2. Set up strong testing & CI
3. Separate Production and Dev
4. Set up monitoring & reporting
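To make "strong testing" concrete, here is a minimal sketch of a unit test for a single ETL step. It assumes the testthat package, which the slides do not name; the function clean_events and its columns are hypothetical.

```r
library(testthat)

# Hypothetical ETL step: drop rows with a missing user_id, then remove
# exact duplicate rows.
clean_events <- function(events) {
  events <- events[!is.na(events$user_id), , drop = FALSE]
  events <- unique(events)
  rownames(events) <- NULL
  events
}

test_that("clean_events drops missing ids and duplicate rows", {
  raw <- data.frame(
    user_id = c(1, 1, NA, 2),
    action  = c("view", "view", "click", "click"),
    stringsAsFactors = FALSE
  )
  cleaned <- clean_events(raw)
  expect_equal(nrow(cleaned), 2)
  expect_false(anyNA(cleaned$user_id))
  expect_equal(cleaned$action, c("view", "click"))
})
```

A CI service can run tests like this on every commit, so a failing expectation blocks the build before anything reaches production.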

Case Study: HB Architecture

- RStudio
- Containerized architecture
- Continuous integration
- Multiple environments
- Notifications/monitoring

Data Architecture

elasticsearch:
  image: elasticsearch
shiny-server:
  image: shiny
  ports:
    - "443:443"
  links:
    - elasticsearch
etl:
  image: etl
  volumes:
    - .:/data
etl-data:
  image: etl-data

[Diagram: data architecture. Components shown: SQL, S3, ETL, ETL data, Elastic, Shiny Server, rAPI, Web.]

Docker Compose Containers

RStudio Server

Environments

Production:
- www.dataproduct.com
- internal-dashboards.com
- Stack: ETL, Shiny Server, Elastic, data volume, SQL, S3

Staging:
- staging-www.dataproduct.com
- staging-internal-dashboards.com
- Stack: ETL, Shiny Server, Elastic, data volume, SQL, S3
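One way to keep the two environments apart in R code is to select settings from an environment variable. This is a minimal sketch, not the deck's actual setup: the hostnames come from the slide, while the variable name APP_ENV, the field names, and the function get_config are assumptions.

```r
# Per-environment settings; hostnames are taken from the slide, field
# names are hypothetical.
environments <- list(
  production = list(
    product_url   = "www.dataproduct.com",
    dashboard_url = "internal-dashboards.com"
  ),
  staging = list(
    product_url   = "staging-www.dataproduct.com",
    dashboard_url = "staging-internal-dashboards.com"
  )
)

# APP_ENV is an assumed variable name. Defaulting to staging means an
# unset variable never points a session at production.
get_config <- function(env = Sys.getenv("APP_ENV", unset = "staging")) {
  if (!env %in% names(environments)) {
    stop("Unknown environment: ", env)
  }
  environments[[env]]
}

config <- get_config("staging")
```

Failing loudly on an unknown environment name is a deliberate choice here: a typo should stop the job rather than silently fall back to some default.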

Continuous Integration

[Diagram: a commit goes to GitHub and is built by Travis CI. On success, the latest-stable tag is updated, and Staging and Production each pull latest-stable.]

Docker Registry/Rolling Back

- Changes deployed to prod: save a versioned image (ETL, data volume) to the Docker Registry
- Danger! Need to roll back: load the older image (ETL, data volume) from the Docker Registry
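The rollback idea can be sketched as picking the image version saved just before the current one. Everything named here is hypothetical: the registry path, the version tags, and the function previous_image are illustrations, not the deck's actual tooling. The sketch only prints a docker pull command and does not run it.

```r
# Hypothetical registry path and version tags; the slides do not name them.
image    <- "registry.example.com/etl"
versions <- c("1.0.0", "1.1.0", "1.2.0")  # oldest to newest, one per deploy

# Return the image reference to load when rolling back from `current`.
previous_image <- function(image, versions, current) {
  idx <- match(current, versions)
  if (is.na(idx)) stop("Unknown version: ", current)
  if (idx == 1) stop("No older image to roll back to")
  paste0(image, ":", versions[idx - 1])
}

target <- previous_image(image, versions, current = "1.2.0")

# Print the command an operator could run; nothing is executed here.
cat("docker pull", target, "\n")
```

This only works if every deploy saves an immutable, versioned tag, which is the point of the "Save Versioned Image" step.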

Step 3: Assimilate!

(i.e., be kind to your devs)

Assimilate (contd)

- HTTP APIs
  - OpenCPU, rapier
- Docker containers
  - Rocker
- Command line tools
  - Rscript, littler, docopt
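For the command line route, here is a minimal sketch of an Rscript tool that other developers can call like any Unix program. It uses only base R argument handling; docopt or littler would be alternatives, as the slide suggests. The script name col_mean.R and its arguments are hypothetical.

```r
#!/usr/bin/env Rscript
# Usage (hypothetical script name): Rscript col_mean.R data.csv price

# Compute the mean of one column in a CSV file.
col_mean <- function(path, column) {
  data <- read.csv(path, stringsAsFactors = FALSE)
  if (!column %in% names(data)) stop("No such column: ", column)
  mean(data[[column]], na.rm = TRUE)
}

# Parse arguments, print the result, and return an exit status.
main <- function(args) {
  if (length(args) != 2) {
    cat("Usage: col_mean.R <file.csv> <column>\n", file = stderr())
    return(invisible(1L))
  }
  cat(col_mean(args[1], args[2]), "\n")
  invisible(0L)
}

# Run only when arguments were passed on the command line.
args <- commandArgs(trailingOnly = TRUE)
if (length(args) > 0) quit(status = main(args))
```

Returning a non-zero exit status on bad input lets shell scripts and schedulers detect failures without parsing the output.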

Thank you!